REMARKS

In a telephone interview on October 5, 2005, the examiner explained that claim 1 
appeared to be directed to a different invention than claims 8 and 9, and asked the applicant's 
representative to select one group. The applicant's representative David L. Feigenbaum selected 
the first group without traverse and agreed to an examiner's amendment to elect claims 1-3. 

The comments of the applicant below are each preceded by related comments of the 
examiner (in small, bold type). 

Claims 1 - 3 of this application are in condition for allowance 
except for the following formal remaining matter: 

The disclosure is objected to because of the following informalities: 
enclosure of appendix is improper. An appendix is limited to enclosure 
of a sequence listing table or a computer program listing (see MPEP § 
608.05). Otherwise, information contained within the appendix should 
be incorporated into the specification or filed through an IDS. 

A substitute specification incorporating information contained within the appendix has 
been provided. 

Canceled claims, if any, have been canceled without prejudice or disclaimer. 

Any circumstance in which the applicant has (a) addressed certain comments of the 
examiner does not mean that the applicant concedes other comments of the examiner, (b) made 
arguments for the patentability of some claims does not mean that there are not other good 
reasons for patentability of those claims and other claims, or (c) amended or canceled a claim 
does not mean that the applicant concedes any of the examiner's positions with respect to that 
claim or other claims. 

No fees are believed due at this time. Please apply any other charges or credits to deposit
account 06-1050. 

Respectfully submitted,



Date: 

DERIVING A PROBABILITY DISTRIBUTION OF A
VALUE OF AN ASSET AT A FUTURE TIME 

BACKGROUND 
[001] This invention relates to generating and providing information 
about expected future prices of assets. 

[002] Among the kinds of information available at web sites on the 
Internet are current and historical prices and volumes of stock 
transactions, prices of put or call options at specific strike prices and 
expiration dates for various stocks, and theoretical prices of put and call 
options that are derived using formulas such as the Black-Scholes formula. 
Some web sites give predictions by individual experts of the future prices 
or price ranges of specific stocks. 

[003] A call option gives the holder a right to buy an underlying 
marketable asset by an expiration date for a specified strike price. A put 
option gives an analogous right to sell an asset. Options are called 
derivative securities because they derive their values from the prices of the 
underlying assets. Examples of underlying assets are corporate stock, 
commodity stock, and currency. The price of an option is sometimes 
called the premium. 

[004] People who buy and sell options are naturally interested in what 
appropriate prices might be for the options. One well-known formula for 
determining the prices for call and put options under idealized conditions 
is called the Black-Scholes formula. Black-Scholes provides an estimate 
of call or put prices for options having a defined expiration date, given a 
current price of the underlying asset, an interest rate, and the volatility rate 
(sometimes simply called volatility) of the asset. Black-Scholes assumes 
constant interest rates and volatility, no arbitrage, and trading that is
continuous over a specified price range. 

SUMMARY 

[005] In general, in one aspect, the invention features a method in which 
data is received that represents current prices of options on a given asset. 
An estimate is derived from the data of a corresponding implied 
probability distribution of the price of the asset at a future time. 
Information about the probability distribution is made available within a 
time frame that is useful to investors, for example, promptly after the 
current option price information becomes available. 

[006] Implementations of the invention may include one or more of the 
following features. The data may represent a finite number of prices of 
options at spaced-apart strike prices of the asset. A set of first differences 
may be calculated of the finite number of prices to form an estimate of the 
cumulative probability distribution of the price of the asset at a future 
time. A set of second differences may be calculated of the finite number of 
strike prices from the set of first differences to form the estimate of the 
probability distribution function of the price of the asset at a future time. 

[007] In general, in another aspect, the invention features a method in 
which a real time data feed is provided that contains information based on 
the probability distribution. 

[008] In general, in another aspect, the invention features a method that 
includes providing a graphical user interface for viewing pages containing 
financial information related to an asset; and when a user indicates an 
asset of interest, displaying probability information related to the price of
the asset at a future time. 

[009] In general, in another aspect, the invention features a method that 
includes receiving data representing current prices of options on a given 
asset, the options being associated with spaced-apart strike prices of the 
asset at a future time. The data includes shifted current prices of options 
resulting from a shifted underlying price of the asset, the amount by which 
the asset price has shifted being different from the amount by which the 
strike prices are spaced apart. An estimate is derived from a quantized 
implied probability distribution of the price of the asset at a future time, 
the elements of the quantized probability distribution being more finely 
spaced than for a probability distribution derived without the shifted 
current price data. 

[010] In general, in another aspect, the invention includes deriving from 
said data an estimate of an implied probability distribution of the price of
the asset at a future time, the mathematical derivation including a 
smoothing operation. 

[011] Implementations of the invention may include one or more of the
following features. The smoothing operation may be performed in a 
volatility domain. 

[012] In general, in another aspect, the invention includes deriving a 
volatility for each of the future dates in accordance with a predetermined 
option pricing formula that links option prices with strike prices of the 
asset; and generating a smoothed and extrapolated volatility function. 

[013] Implementations of the invention may include one or more of the
following features. The volatility function may be extrapolated to a wider 
range of dates than the future dates and to other strike prices. The 
smoothed volatility function may be applicable to conditions in which the 
data is reliable under a predetermined measure of reliability. The implied 
volatility function formula may have a quadratic form with two variables 
representing a strike price and an expiration date. The coefficients of the 
implied volatility function formula may be determined by applying 
regression analysis to approximately fit the implied volatility function 
formula to each of the implied volatilities. 
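
A minimal sketch of how such a regression fit might look, assuming a six-term quadratic in a rescaled strike u and a time to expiration t in years. The rescaling u = (strike - 150)/50, the function names, and the synthetic implied volatilities are all illustrative assumptions rather than anything stated in this specification, and ordinary least squares via the normal equations stands in for whatever regression method an implementation would actually use.

```python
# Sketch: least-squares fit of a quadratic implied-volatility function.
# All names, the strike rescaling, and the sample data are hypothetical.

def features(u, t):
    """Quadratic basis in rescaled strike u and time to expiration t."""
    return [1.0, u, t, u * u, u * t, t * t]

def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= factor * M[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x

def fit_quadratic_vol(samples):
    """samples: list of (u, t, implied_vol); returns the six coefficients."""
    rows = [features(u, t) for u, t, _ in samples]
    vols = [v for _, _, v in samples]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * v for r, v in zip(rows, vols)) for i in range(k)]
    return solve_linear(xtx, xty)

def smoothed_vol(coeffs, u, t):
    """Evaluate the fitted quadratic at (u, t), including outside the sample."""
    return sum(c * f for c, f in zip(coeffs, features(u, t)))

# Synthetic implied volatilities generated from a known quadratic.
TRUE = [0.30, -0.05, 0.02, 0.08, -0.01, 0.005]
samples = []
for strike in range(120, 185, 10):          # strikes 120, 130, ..., 180
    u = (strike - 150.0) / 50.0
    for t in (0.25, 0.5, 0.75, 1.0):
        samples.append((u, t, smoothed_vol(TRUE, u, t)))

coeffs = fit_quadratic_vol(samples)
print([round(c, 6) for c in coeffs])
```

Once fitted, `smoothed_vol` can be evaluated at strikes and dates that were never quoted, which is the sense in which the function is both smoothed and extrapolated. With noisy market data the fit would be approximate rather than exact.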

[014] In general, in another aspect, the invention features a method that 
includes receiving data representing current prices of options on assets 
belonging to a portfolio, deriving from the data an estimate of an implied 
multivariate distribution of the price of a quantity at a future time that 
depends on the assets belonging to the portfolio, and making information 
about the probability distribution available within a time frame that is 
useful to investors. 

[015] In general, in another aspect, the invention features a method that 
includes receiving data representing values of a set of factors that 
influence a composite value, deriving from the data an estimate of an 
implied multivariate distribution of the price of a quantity at a future time 
that depends on assets belonging to a portfolio, and making information 
about the probability distribution available within a time frame that is 
useful to investors. 

[016] Implementations of the invention may include one or more of the 
following features. The mathematical derivation may include generating a 
multivariate probability distribution function based on a correlation among
the factors. 

[017] In general, in another aspect, the invention features a graphical 
user interface that includes a user interface element adapted to enable a 
user to indicate a future time, a user interface element adapted to show a 
current price of an asset, and a user interface element adapted to show the 
probability distribution of the price of the asset at the future time. 

[018] In general, in one aspect, the invention features a method that
includes continually generating current data that contains probability 
distributions of prices of assets at future times, continually feeding the 
current data to a recipient electronically, and the recipient using the fed 
data for services provided to users. 

[019] In general, in another aspect, the invention features a method that 
includes receiving data representing current prices of options on assets 
belonging to a portfolio, receiving data representing current prices of 
market transactions associated with a second portfolio of assets, and 
providing information electronically on the probability that the second 
portfolio of assets will reach a first value given the condition that the first 
portfolio of assets reaches a specified price at a future time. 

[020] In general, in another aspect, the invention features a method that 
includes receiving data representative of actual market transactions 
associated with a first portfolio of assets; receiving data representative of 
actual market transactions associated with a second portfolio of assets; and 
providing, through a network, information on the expectation value of the
price of the first portfolio of assets given the condition that the second
portfolio of assets reaches a first specified price at a specified future time.

[021] In general, in another aspect, the invention features a method that
includes evaluating an event defined by a first multivariate expression that 
represents a combination of macroeconomic variables at a time T, and 
estimating (e.g., using Monte Carlo techniques) the probability that a 
second multivariate expression that represents a combination of values of 
assets of a portfolio will have a value greater than a constant B at time T if 
the value of the first multivariate expression is greater than a constant A. 
The market variables represented by the first multivariate expression can 
include macroeconomic factors (such as interest rates), market preferences 
regarding the style of company fundamentals (large/small companies, 
rapid/steady growth, etc.), or market preferences for industry sectors. 
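
A minimal Monte Carlo sketch of this kind of conditional estimate. Here x stands in for the first multivariate expression and y for the second; modelling them as standard normal variables with correlation rho, and every name and parameter value in the code, are assumptions made only for illustration.

```python
import math
import random

def conditional_probability(n_samples, rho, a, b, seed=0):
    """Estimate P(y > b | x > a) for standard normals x, y with correlation rho."""
    rng = random.Random(seed)
    conditioned = 0   # samples with x > a
    hits = 0          # samples with x > a and y > b
    for _ in range(n_samples):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rng.gauss(0.0, 1.0)
        x = z1                                           # hypothetical macro expression
        y = rho * z1 + math.sqrt(1.0 - rho * rho) * z2   # hypothetical portfolio expression
        if x > a:
            conditioned += 1
            if y > b:
                hits += 1
    return hits / conditioned

estimate = conditional_probability(200_000, rho=0.6, a=0.0, b=0.0)
# Closed form for this special case (A = B = 0, bivariate normal).
exact = 0.5 + math.asin(0.6) / math.pi
print(round(estimate, 3), round(exact, 3))
```

The closed form is available only because of the simple model chosen here; for realistic multivariate expressions the simulated ratio of counts is the estimate.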

[022] In general, in another aspect, the invention features a method that 
includes defining a regression expression that relates the value of one 
variable representing a combination of macroeconomic variables at time T 
to a second variable at time T that represents a combination of assets of a 
portfolio, and estimating the probability that the second variable will have 
a value greater than a constant B at time T if the value of the first variable 
is greater than a constant A at time T, based on the ratio of the probability 
of x being greater than A under the regression expression and the 
probability of x being greater than A. 

[023] In general, in another aspect, the invention features a method that 
includes defining a current value of an option as a quadratic expression 
that depends on the difference between the current price of the option and 
the current price of the underlying security, and using Monte Carlo 
techniques to estimate a probability distribution of the value at a future 
time T of a portfolio that includes the option. 

[024] The invention takes advantage of the realization that option prices
for a given underlying asset are indicative of the market's prediction of
the risk-neutral price of the underlying asset in the future (e.g., at the
expiration of the option). A risk-neutral price for an asset is one which
would occur in a world in which investors had no aversion to (or appetite
for) the risks inherent in investments. Option price data may be used to
derive the market's prediction in the form of an implied probability
distribution of future risk-neutral prices. The shape of this distribution
can then be used as a guide to understanding the distribution of future
real-world prices.

[025] The implied probability distribution and other information related 
to it may be made easily available to people for whom the information 
may be useful, such as those considering an investment in the underlying 
asset, or a brokerage firm advising such an investor. 

[026] Among the advantages of the invention are one or more of the 
following: Investors and prospective investors in an underlying asset, such 
as a publicly-traded stock, are given access to a key additional piece of 
current information, namely calculated data representing the market's view 
of the future price of the stock. Brokerage firms, investment advisors, and 
other companies involved in the securities markets are able to provide the 
information or related services to their clients and customers. 

[027] Other features and advantages will become apparent from the 
following description and from the claims. 

DESCRIPTION

[028] Details of implementations of the invention are set forth in the 
figures and the related description below. 

[029] Figures 1, 2, and 3 are graphs. 

[030] Figure 4 is a block diagram. 

[031] Figures 5, 6, and 7 are web pages. 

[032] Figures 8 and 9 illustrate user interfaces. 

[033] Figure 10 shows data structures. 

[034] In general, the price of a call or put option is determined by buyers 
and sellers in the option market and carries information about the market's 
prediction of the expected price of the underlying asset at the expiration 
date. (The information does not include the premium that investors require 
for bearing risk, which must be estimated separately. The average long- 
term value of the risk premium is about 6% per year for all stocks and may 
be adjusted for an individual stock's historical responsiveness to broader 
market movements.) 

[035] The information carried in the prices of options having various 
strike prices and expirations is used to derive probability distributions of 
the asset's price at future times and to display corresponding information 
to investors, for example, on the World Wide Web. 

Basic method 

[036] We first define some relevant terms. We define x as the strike
price, c(x) as the theoretical call price function (the price of the call as a
function of strike price), p(x) as the theoretical put price function, F(x) as
the cumulative distribution function (cdf) of the price of the underlying
asset at expiration, and f(x) as the probability density function (pdf) of the
asset price at expiration. By definition, f(x) = F'(x) (i.e., the probability
density function is the derivative of the cumulative distribution function).
We assume the options are exercisable only on the expiration date, i.e.,
they are "European-style" options. Even allowing for possible early
exercise ("American-style" options), most liquidly traded call options
without large dividends can be treated as if there were no possibility of
such exercise, since sale of the option is usually a better alternative;
therefore, these call options behave similarly to European-style options.
The actual call and put prices are established by options market-makers.
Such prices implicitly contain information about a market view of the
probability distribution of the price of that asset at the expiration date.

[037] In a simple but precise form, this market view can be stated as
follows. Suppose that we were given the call price curve c(x) or the put
price curve p(x) as a continuous function of the strike price x for all x > 0.
Then, the second derivative of either the call or the put price curve is the
market view of the risk-neutral probability density function (pdf) f(x) of
the asset price at the expiration date. The relationship between c(x), p(x),
f(x), and F(x) can be succinctly stated as:

$$ F(x) = c'(x) + 1 = p'(x); \tag{1a} $$

$$ f(x) = c''(x) = p''(x). \tag{1b} $$
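
The following sketch checks these two relationships numerically. It assumes, purely for illustration, that the risk-neutral distribution is lognormal with mean s*, so that the undiscounted call price has a closed form; the lognormal choice, the parameter values, and all function names are assumptions and not part of the method itself, which does not depend on any particular distribution.

```python
import math

def norm_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def norm_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def call_price(x, s_star, sd):
    """Undiscounted call price E[max(v - x, 0)] for lognormal v with mean s_star.

    sd is the volatility multiplied by the square root of the time to expiration.
    """
    d1 = (math.log(s_star / x) + 0.5 * sd * sd) / sd
    return s_star * norm_cdf(d1) - x * norm_cdf(d1 - sd)

def lognormal_cdf(x, s_star, sd):
    d2 = (math.log(s_star / x) - 0.5 * sd * sd) / sd
    return norm_cdf(-d2)

def lognormal_pdf(x, s_star, sd):
    d2 = (math.log(s_star / x) - 0.5 * sd * sd) / sd
    return norm_pdf(d2) / (x * sd)

S_STAR, SD, H = 150.0, 0.15, 0.05   # hypothetical forward price, vol*sqrt(T), step

def first_derivative(x):
    """Central-difference estimate of c'(x)."""
    return (call_price(x + H, S_STAR, SD) - call_price(x - H, S_STAR, SD)) / (2.0 * H)

def second_derivative(x):
    """Central-difference estimate of c''(x)."""
    return (call_price(x + H, S_STAR, SD) - 2.0 * call_price(x, S_STAR, SD)
            + call_price(x - H, S_STAR, SD)) / (H * H)

for x in (130.0, 150.0, 170.0):
    print(x,
          round(first_derivative(x) + 1.0, 6), round(lognormal_cdf(x, S_STAR, SD), 6),
          round(second_derivative(x), 6), round(lognormal_pdf(x, S_STAR, SD), 6))
```

For each strike, the first pair of printed numbers compares c'(x) + 1 with F(x), and the second pair compares c''(x) with f(x).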

[038] In words, the pdf is the second derivative of either the call price
function or of the put price function. A simple proof of these relationships
follows. As in Hull, we may calculate the European call or put price as an
expected value in the risk-neutral distribution.

[039] If the actual value of the asset on the expiration date is v, then the
value of a call option at strike price x is max{v - x, 0}, and the value of a
put option is max{x - v, 0}. If the actual value is a random variable with
pdf f(v), then the expected value of a call option at x at the expiration date
is

$$ c_T(x) = E_f[\max\{v - x, 0\}] = \int_x^{\infty} (v - x) f(v)\,dv, \tag{2a} $$

and the expected value of a put option at x at the expiration date is

$$ p_T(x) = E_f[\max\{x - v, 0\}] = \int_0^{x} (x - v) f(v)\,dv. \tag{2b} $$

[040] The current values c(x) and p(x) may be obtained by discounting
c_T(x) and p_T(x) by e^{-rT}, where r is the risk-free interest rate, but for our
purposes, forecasting probability distributions at time T, we do no
discounting, and henceforth just write c(x) = c_T(x), p(x) = p_T(x).

[041] Parenthetically, from these expressions we observe that

$$ p(x) - c(x) = \int_0^{\infty} (x - v) f(v)\,dv = x - E_f[v] = x - s^*, \tag{2c} $$

where s* = E_f[v] is the expected value of the asset at the expiration date
under the risk-neutral distribution. (If there are no dividends, then s* =
se^{rT}; if there are dividends, then in general it is necessary to subtract from
se^{rT} the value at time T of the dividends.) This well-known relation is
called put-call parity; it shows why either price curve carries the same
information.
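
As a small worked check of put-call parity, the sketch below uses a made-up discrete distribution of expiration values (the values and probabilities are hypothetical) and confirms that p(x) - c(x) = x - s* at several strikes, with no discounting applied.

```python
# Hypothetical discrete risk-neutral distribution of the asset value v at expiration.
values = [120.0, 140.0, 150.0, 160.0, 190.0]
probs = [0.1, 0.2, 0.4, 0.2, 0.1]

def call_value(x):
    """Expected call payoff E[max(v - x, 0)]."""
    return sum(q * max(v - x, 0.0) for v, q in zip(values, probs))

def put_value(x):
    """Expected put payoff E[max(x - v, 0)]."""
    return sum(q * max(x - v, 0.0) for v, q in zip(values, probs))

s_star = sum(q * v for v, q in zip(values, probs))   # mean of the distribution

for x in (100.0, 145.0, 155.0, 200.0):
    print(x, round(put_value(x) - call_value(x), 6), round(x - s_star, 6))
```

The two printed columns agree at every strike, including strikes below the smallest and above the largest possible value of v.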

[042] From the above expression for c(x), it follows that its first
derivative is

$$ c'(x) = -\int_x^{\infty} f(v)\,dv = F(x) - 1, \tag{3} $$

where $F(x) = \int_0^{x} f(v)\,dv$ is the cumulative distribution function (cdf) of
the random variable v. To prove this, note that $v - x = \int_x^{v} dz$. Therefore

$$ c(x) = \int_x^{\infty} (v - x) f(v)\,dv = \int_x^{\infty} dv \int_x^{v} dz\, f(v) = \int_x^{\infty} dz \int_z^{\infty} dv\, f(v) = \int_x^{\infty} dz\,(1 - F(z)), \tag{4} $$

where we interchange the variables v, z to integrate over the
two-dimensional region {(v, z) : x ≤ z ≤ v}. The last expression implies
that c'(x) = -(1 - F(x)).

[043] From put-call parity, it follows similarly that 

$$ p'(x) = 1 + c'(x) = F(x). \tag{5} $$

[044] Since the cdf and pdf are related by F'(x) = f(x), these expressions
in turn imply that the second derivative of either c(x) or p(x) is the pdf f:

$$ c''(x) = p''(x) = F'(x) = f(x). \tag{6} $$

[045] The general character of the option price curves c(x) and p(x) is
therefore as follows:

• For all x less than the minimum possible value of v (i.e., such that
F(x) = 0), c(x) = E_f[v] - x = s* - x and p(x) = 0. In other words, c(x)
is a straight line of slope -1 starting at c(0) = E_f[v] = s*, while p(x)
= 0.

• For all x greater than the maximum possible value of v (i.e., such
that F(x) = 1), c(x) = 0 and p(x) = x - s*. In other words, p(x) is a
straight line of slope +1 and x-intercept s*, while c(x) = 0.

• These two line segments are joined by a continuous convex ∪
curve whose slope increases from -1 to 0 for c(x), and from 0 to +1
for p(x).

[046] We note that the fact that the mean E_f[v] of the pdf f(x) is s*, the
value in future dollars at time T of the underlying price s (less the value of
any dividends), implies that option prices must be constantly adjusted to
reflect changes in the underlying price s, even if there is no market activity
in the options.

[047] The fact that s* = E_f[v] also implies that an option price curve can
make no prediction about the general direction of the underlying price s.
However, the option price curve does predict the shape of the pdf f(x), and
in particular its volatility.

[048] This so-called "second-derivative method" for computing implied 
probability distributions from option price data is known in the academic 
literature, but apparently not very well known. For example, the standard 
textbook "Options, Futures, and Other Derivatives," by John C. Hull 
(Fourth Edition, 1999; Prentice-Hall) mentions implied probabilities, but
not the second-derivative method. Perhaps the best reference that we have
been able to find is J. C. Jackwerth and M. Rubinstein, "Recovering
probability distributions from option prices," J. Finance, vol. 51, pp.
1611-1631 (1996), which has only six prior references. This paper cites D. T.
Breeden and R. H. Litzenberger, "Prices of state-contingent claims
implicit in option prices," J. Business, vol. 51, pp. 631-650 (1978) as the
originator of a second-derivative method, although the latter paper 
nowhere mentions probabilities. 

[049] The risk-neutral distribution (at a fixed future time T, for a fixed
asset) is defined as the price distribution that would hold if market
participants were neutral to risk, which they generally are not. However,
many asset pricing theories, such as those underlying Black-Scholes
option theory and most of the variations found in the Hull book above,
allow for the true risk-averse asset price distribution to be obtained from
the risk-neutral distribution f(x) just by adjusting the latter by an
appropriate risk premium: If there are no dividends, the true distribution is
just f(xe^{-(μ-r)T}), where μ - r is the expected annual return rate for the stock
in excess of the risk-free rate r. We use a variation on this simple format,
slightly modified to allow for dividends (see below), though our invention
could also work well with a more complicated adjustment. In this format,
a value for μ - r must still be supplied. We use as a default the "consensus
estimate" taken from the textbook "Active Portfolio Management" (1995)
by Grinold and Kahn. These authors note a long-term average value of the
risk premium to be 6% per year, and suggest multiplying this number by
the stock's beta to get μ - r. The parameter beta is the slope of the line
giving a regression of the stock in question against a market portfolio,
often taken as the S&P 500. This is the well-known CAPM estimate for
the expected excess return. Whether good or bad, its stature as a consensus
estimate makes it suited to our aim of providing a market view, though it
is only a default. Our invention, which provides the risk-neutral
component of the probabilities, could work with other estimates for the
risk-averse adjustment parameter μ - r and with any explicit scheme for
adjusting the risk-neutral probability density to the risk-averse probability
density. It is worth pointing out that, for shorter time periods (even a month
or two), the risk adjustment required is small and generally overwhelmed
by fluctuations in the risk-neutral distribution itself.
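
To give a feel for the size of this default adjustment, the sketch below computes the excess return as beta times the 6% long-term premium and applies the no-dividend adjustment by evaluating the risk-neutral pdf at a rescaled price. The beta of 1.2, the three-month horizon, the uniform stand-in pdf, and the function names are hypothetical choices for illustration.

```python
import math

def excess_return(beta, market_premium=0.06):
    """Default CAPM-style estimate of the annual excess return: beta times the premium."""
    return beta * market_premium

def price_scale(beta, T, market_premium=0.06):
    """Factor by which price levels shift over T years under the adjustment."""
    return math.exp(excess_return(beta, market_premium) * T)

def risk_averse_density(risk_neutral_pdf, x, beta, T, market_premium=0.06):
    """No-dividend adjustment: evaluate the risk-neutral pdf at x / price_scale."""
    return risk_neutral_pdf(x / price_scale(beta, T, market_premium))

def uniform_pdf(x):
    """Hypothetical risk-neutral pdf, uniform on [100, 200]."""
    return 0.01 if 100.0 <= x <= 200.0 else 0.0

beta, T = 1.2, 0.25
print(round(excess_return(beta), 4))      # annual excess return
print(round(price_scale(beta, T), 5))     # shift factor over three months
print(risk_averse_density(uniform_pdf, 202.0, beta, T))
```

For these inputs the shift over three months is a little under 2%, which illustrates why the adjustment matters less at short horizons.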



Approximating f(x) from finite bid and ask option prices 

[050] Equations (1a) and (1b) are obtained by assuming that the variable
x is continuous and ranges from 0 to infinity. In practice, options are
usually traded within certain price ranges and only for certain price
intervals (e.g., ranging from $110 to $180 at $5 intervals). Thus, the call
and/or put option prices are known only for a finite subset of strike prices.
Under such circumstances, estimates of Equations (1a) and (1b) can be
computed by taking differences instead of derivatives as follows.

[051] We assume that the option prices c(x) and p(x) are quoted for a
finite subset of equally-spaced strike prices x = nΔ, where n is an integer,
and Δ is the spacing between quoted prices. Define c_n = c(nΔ), p_n = p(nΔ).
Then the first derivatives c'(x) and p'(x) at x = (n + 1/2)Δ may be
estimated by the first differences:

$$ c'_{n+1/2} = \frac{c_{n+1} - c_n}{\Delta}, \tag{7a} $$

$$ p'_{n+1/2} = \frac{p_{n+1} - p_n}{\Delta}. \tag{7b} $$

[052] The corresponding estimates of the cumulative distribution
function F_{n+1/2} = F((n + 1/2)Δ) are

$$ F_{n+1/2} = 1 + c'_{n+1/2}, \tag{8a} $$

$$ F_{n+1/2} = p'_{n+1/2}. \tag{8b} $$

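[053] The second derivatives c''(x) and p''(x) at x = nΔ may likewise be
estimated by the second differences, i.e., differences of the estimates of
the first derivatives:

$$ c''_n = \frac{c'_{n+1/2} - c'_{n-1/2}}{\Delta} = \frac{c_{n+1} - 2c_n + c_{n-1}}{\Delta^2}, \tag{9a} $$

$$ p''_n = \frac{p'_{n+1/2} - p'_{n-1/2}}{\Delta} = \frac{p_{n+1} - 2p_n + p_{n-1}}{\Delta^2}. \tag{9b} $$

[054] Either of these estimates of the second derivatives may be used as
an estimate of the probability density values at x = nΔ, i.e., f(nΔ):

$$ f_n = p''_n \quad \text{or} \quad f_n = c''_n. \tag{10} $$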

[055] Moreover, the market prices of call and put options are usually
given in terms of a bid-ask spread, and thus either the bid price or the ask
price (or some intermediate value) may be used as the call or put option
price. By using the bid and ask prices for both the call option and the put
option, four estimates of F(x) and f(x) may be obtained. These estimates
may be combined according to their reliability in any desired way. For
example, one might use the estimate derived from the put bid price curve
for values of x less than the current price s of the underlying asset, and the
estimate derived from the call bid price curve for values of x greater than
s. Examples of c_n, p_n, F_{n+1/2}, and f_n are shown in Figures 1, 2, and 3 using
the data of TABLE 1 (see below). This combination will preferably take
into account whether x = (n + 1/2)Δ is much less than the underlying price s
("deep out-of-the-money"), near s ("near the money"), or much greater
than s ("deep in-the-money"), according to the different patterns of setting
bid-asked spreads in these different ranges. Another consideration is
avoiding quotes near prices where early exercise is likely, such as deep
in-the-money puts.

[056] Similarly, the second derivatives c''(x) and p''(x) at x = nΔ may be
estimated by the first differences of the estimates of the first derivatives:

$$ c''_n = \frac{c'_{n+1/2} - c'_{n-1/2}}{\Delta} = \frac{c_{n+1} - 2c_n + c_{n-1}}{\Delta^2}. \tag{11} $$

[057] We may take c''_n, or p''_n, or some combination as above as our
estimate of the pdf f(nΔ).

[058] Note that since f(x) ≥ 0, option prices should satisfy a convexity
condition, e.g., c_{n+1} - 2c_n + c_{n-1} ≥ 0 for call option prices. Indeed, violation
of this condition would allow making money via a risk-free "butterfly
straddle" involving buying one call option at (n + 1)Δ and another at (n - 1)Δ,
and selling two call options at nΔ. A similar result holds for put options.
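
The convexity condition is easy to check mechanically. The sketch below does so for the call bid prices quoted in TABLE 1 for strikes $110 through $155; the function names are hypothetical, and the perturbed price used to trigger a violation is invented for the demonstration.

```python
# Call bid prices from TABLE 1 for strikes 110..155 (fractions written as decimals).
call_bid = {110: 42.875, 115: 37.875, 120: 33.0, 125: 28.125, 130: 23.375,
            135: 19.0, 140: 14.875, 145: 11.0, 150: 7.75, 155: 5.375}

def butterfly_costs(prices, spacing):
    """Cost of buying calls at k - spacing and k + spacing and selling two at k."""
    costs = {}
    for k in sorted(prices):
        lo, hi = k - spacing, k + spacing
        if lo in prices and hi in prices:
            costs[k] = prices[hi] - 2.0 * prices[k] + prices[lo]
    return costs

def convexity_violations(prices, spacing):
    """Strikes at which the butterfly cost is negative, i.e. convexity fails."""
    return [k for k, cost in butterfly_costs(prices, spacing).items() if cost < 0.0]

print(butterfly_costs(call_bid, 5))
print(convexity_violations(call_bid, 5))      # no violations in the tabulated prices

# Invented example: an overpriced 130 call breaks convexity at that strike.
perturbed = dict(call_bid)
perturbed[130] = 24.0
print(convexity_violations(perturbed, 5))
```

A negative butterfly cost means the position, whose payoff at expiration is never negative, could be entered for a net credit.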

Tabular data



[059] TABLE 1 below shows sample bid prices of call and put options
for strike prices of an asset ranging from $110 to $180 at $5 intervals and
the cumulative distribution values F_{n+1/2} and probability density values
f_n computed according to Equations (7)-(10) above.

[060] In the table, the values for F_{n+1/2} correspond to strike prices that
are midway between the two strike prices used to compute F_{n+1/2}. Thus,
the cumulative distribution value shown to the right of the strike price
$110 actually corresponds to the strike price $112.5, and the value to the
right of the strike price $115 actually corresponds to strike price $117.5,
and so forth.

TABLE 1

Strike   Call      F_{n+1/2}    f_n          Put       F_{n+1/2}   f_n
price    price     from call    from call    price     from put    from put
                   price        price                  price       price

110      42 7/8    0            0            1/8       0           0
115      37 7/8    0.025        0.025        1/8       0.0125      0.0125
120      33        0.025        0            3/16      0.0125      0
125      28 1/8    0.05         0.025        1/4       0.0375      0.025
130      23 3/8    0.125        0.075        7/16      0.0875      0.05
135      19        0.175        0.05         7/8       0.15        0.0625
140      14 7/8    0.225        0.05         1 5/8     0.2375      0.0875
145      11        0.35         0.125        2 13/16   0.3875      0.15
150      7 3/4     0.525        0.175        4 3/4     0.5         0.1125
155      5 3/8     0.6          0.075        7 1/4     0.6         0.1
160                0.7375       0.1375       10 1/4    0.725       0.125
165      2 1/16    0.825        0.0875       13 7/8    0.825       0.1
170      1 3/16    0.8875       0.0625       18        0.85        0.025
175      5/8       0.925        0.0375       22 1/4    0.925       0.075
180      1/4                                 26 7/8
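
The put-price columns of TABLE 1 can be reproduced with a few lines of code. The sketch below takes first differences of the put bid prices to obtain the cdf values at the midpoints and then differences those again. One observation, offered as a reading of the table rather than something the text states: the tabulated f_n values match the plain differences of successive cdf values (probability per $5 interval), which equal the second difference of Equation (9b) multiplied by the spacing. Variable and function names are hypothetical.

```python
DELTA = 5.0
strikes = list(range(110, 185, 5))                 # 110, 115, ..., 180
# Put bid prices from TABLE 1 (fractions written as decimals).
put_bid = [0.125, 0.125, 0.1875, 0.25, 0.4375, 0.875, 1.625, 2.8125,
           4.75, 7.25, 10.25, 13.875, 18.0, 22.25, 26.875]

def first_differences(values, delta):
    """Differences of successive entries divided by the spacing."""
    return [(b - a) / delta for a, b in zip(values, values[1:])]

cdf_mid = first_differences(put_bid, DELTA)        # F at strike + 2.5, since F = p'
density = first_differences(cdf_mid, DELTA)        # p''_n as in Eq. (9b), per dollar
bin_mass = [f * DELTA for f in density]            # probability per $5 interval

for k, F in zip(strikes, cdf_mid):
    print(f"F({k + DELTA / 2:.1f}) = {F:.4f}")
for k, m in zip(strikes[1:], bin_mass):
    print(f"strike {k}: mass {m:.4f}")
```

The call column can be treated the same way using F = 1 + c' from Equation (1a).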







Dynamic estimates for F(x) and f(x). 



[061] In Equations (7)-(10), the call and put option prices were assumed
to be static in the calculation of the cumulative distribution function F(x)
and probability density function f(x) for a finite subset of strike prices
x = nΔ. In the real world, the price s of the underlying asset changes with
time, and there will be a corresponding change in option prices. As a first
order approximation, if the price s increases by a small amount δ, then the
option price curves will effectively shift to the right by the amount δ.
(Here, δ may be either positive or negative.) As a result, the price c(x) or
p(x) now quoted at strike price x may be used as an estimate for the option
price on the previous price curve at strike price x - δ. As a result, the prices
on the previous curve at a new discrete subset of strike prices x = nΔ - δ
become effectively visible.

[062] The methods considered in the previous subsection allow 
estimation of the cdf and pdf at a subset of Δ-spaced values of x based on
a static set of option quotes at a particular time. 

[063] As previously noted, however, option prices must change
continually in response to changes in the underlying price s. Let s* denote
the corresponding forward price at expiration (the price s evaluated with
interest). Suppose this price (measured in dollars at expiration) moves up
(or down) by a small amount, an increment ε in its logarithm, say, with
little or no change in volatility. Here ε may be viewed as, approximately,
the percentage move δ/s* caused by a move of δ in the (forward) stock
price. We expect in this situation that the (forward) probability distribution
for the stock price will just be shifted by ε in the log domain. That is, the
distribution will appear to be identical there, except with a mean shifted by
ε. Thus, the value of the new cdf at x = e^y is F(e^{y-ε}) = F(x/a), where F
denotes the original cdf with distribution mean s*, and a = e^ε. A
reasonable call price functional equation that gives the same effect, upon
differentiation, is

$$ a\,c(s^*, x/a) = c(a s^*, x), \tag{12} $$

where c(s*, x) denotes the price, in dollars at expiration, for a call option
at strike x when the underlying is at forward price s*. Note in this equation
that all other variables, such as volatility, are assumed to be the same, which
will only be approximately true, even for very small values of ε.
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[064] But, assuming this approximation, we can think of an option price 
at strike x. measured when the (forward) price has moved to as"^. for a 
near 1 , as giving instead a times the price of an option at strike x/a. but 
corresponding to the current underlying price s*. Considering all the 
strikes at which options are frequently quoted, and thinking additively, we 
can effectively observe c(x) (and p(x)) for a different subset of 
a pproximately equally-spaced strike prices, roughly x = nA-d{oT various 
values of ^ = £s*. Some care must, of course, be taken to ensure 
simultaneity of prices, of option and underlying. For this reason, we may 
prefer to consider the values of nA (corresponding to the various standard 
strike values) separately, and synchronize observed time of sales for an 
option at a given strike with the underlying security. Implied volatilities 
(discussed below) could be monitored, to ensure their changes relative to e 
were small. 
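
The rescaling implied by equation (12) may be illustrated as follows. This is a minimal sketch, not a definitive implementation: the function name `rescale_quote` and the sample quotes are hypothetical, and the sketch assumes, as the text does, that volatility is unchanged by the move in the forward price.

```python
def rescale_quote(strike, call_price, fwd_now, fwd_ref):
    """Map a call quote observed at forward price fwd_now onto the price
    curve for the reference forward price fwd_ref, using equation (12):
    a * c(s*, x/a) = c(a*s*, x), with a = fwd_now / fwd_ref.

    Assumes that volatility did not change with the move.
    Returns (equivalent strike, equivalent call price) on the reference curve.
    """
    a = fwd_now / fwd_ref
    return strike / a, call_price / a


# Hypothetical quotes, all at the standard 105 strike, observed while the
# forward price drifts: (forward when observed, strike, call price).
quotes = [
    (100.0, 105.0, 3.10),
    (101.0, 105.0, 3.55),
    (99.5, 105.0, 2.90),
]
effective = [rescale_quote(x, c, s, 100.0) for s, x, c in quotes]
# Each entry is a (strike, price) point on the curve for forward 100.0; the
# effective strikes now fall between the standard grid points.
```

Each observation at a moved forward price thus contributes a point at a strike slightly off the standard grid, which is how the more closely spaced subset of strikes arises.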

[065] Using a similar technique to that described in the above
paragraphs, meaningful average option prices for a given strike can also be
computed, using thin strike intervals and using either short time intervals
or time series methods (time averages weighting the present more than the
past). Note that, without the framework described in this subsection, the
computation of "average" option prices at a given strike is problematic
when the stock price varies in the period over which the average is taken.

[066] Given enough movements of the underlying price, therefore, we
can effectively compute estimates of c(x), p(x), F(x) and/or f(x) for a
subset of strike prices x that is much more closely spaced than the subset
available at any one time.

Extrapolating and smoothing probability distributions.

[067] In a typical options market, the option prices are available only for 
certain expiration dates. In addition, the option prices are more reliable for 
options that are actively traded, which are typically nearer-term options at 
strike prices near the underlying price. It is therefore desirable to 
extrapolate and interpolate probability distributions to times other than 
actual expiration dates and to wider ranges of strike prices. 

[068] Any standard extrapolation and smoothing techniques may be used
directly on the cumulative distribution values F_{n+1/2} or probability density
values to give a smoothed and extrapolated estimate of F(x) or f(x).
Similarly, given such estimated curves for a discrete subset of future times
T, standard interpolation and extrapolation techniques may be used to
estimate such curves for other specified values of T, or for a continuous
range of T > 0.

[069] A less direct but useful approach is to perform extrapolation and 
smoothing on an implied volatility function, which is then used to 
calculate the other functions, such as c(x), p(x), F(x), and f(x). The
volatility rate of an asset (often simply called its volatility) is a measure of 
uncertainty about the returns provided by the asset. The volatility rates of 
a stock may typically be in the range of 0.3 to 0.5 per year. 

[070] An advantage of performing extrapolation and smoothing on
implied volatility curves is that different types of volatility curves
(so-called "volatility smiles") are known and can be used as a guide to the
extrapolation and smoothing process to prevent "overfitting" of certain
unreliable data points. Many records have been kept of the volatilities
implied by option prices, and it is easy to examine how in the past they
have changed with respect to price behavior. For example, the Chicago
Board Options Exchange makes public its average near-the-money
volatility index (now called VIX) for S&P 100 options back to 1986.
Finally, it is easier to work visually with volatility curves, which would
theoretically be flat if f(x) were lognormal, than with visual differences in
near-lognormal pdfs, which can all look very much alike. Mathematically,
model improvements can be made in the volatility domain just by
changing coefficients of low-degree polynomial approximations, even
though these affect higher-order terms in power series for the
corresponding cdfs or pdfs.

[071] The standard method of computing implied volatilities is to invert
the Black-Scholes pricing formula for the actual call price
c(x) or put price p(x) of an underlying asset at a given strike price x, given
the underlying price s (current price of asset), risk-free rate of interest r,
and T (expiration date). When this is done for a range of values of x,
an estimate of an implied volatility curve σ(x) is obtained. This curve may
be smoothed and extrapolated by any standard method to give a smoothed
curve. Then corresponding smoothed put and call price curves may
be computed using the Black-Scholes pricing formula and differentiated
once or twice to give a smoothed cdf or pdf. Finally, given such estimated
curves for a discrete subset of future times T, standard interpolation and
extrapolation techniques may be used to estimate such curves for other
specified values of T, or for a continuous range of T > 0.
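
The inversion of the pricing formula for a single strike may be sketched as follows. This is an illustration under stated assumptions rather than a prescribed procedure: it uses the undiscounted, forward-price form of the call formula of equation (13), the names `bs_call` and `implied_vol` are hypothetical, and bisection is only one of many root-finding methods that could be used.

```python
from math import log, sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # Gaussian with mean zero and variance 1


def bs_call(x, s_fwd, sigma, T):
    """Undiscounted Black-Scholes call price at strike x, in dollars at
    expiration, with s_fwd the forward price of the underlying."""
    d1 = (log(s_fwd / x) + sigma * sigma * T / 2) / (sigma * sqrt(T))
    d2 = d1 - sigma * sqrt(T)
    return s_fwd * STD_NORMAL.cdf(d1) - x * STD_NORMAL.cdf(d2)


def implied_vol(call_price, x, s_fwd, T, lo=1e-4, hi=5.0, tol=1e-10):
    """Invert bs_call for sigma by bisection. The call price is increasing
    in sigma, so the root is unique when it lies inside [lo, hi]."""
    if not bs_call(x, s_fwd, lo, T) <= call_price <= bs_call(x, s_fwd, hi, T):
        raise ValueError("call price is outside the range attainable on [lo, hi]")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if bs_call(x, s_fwd, mid, T) < call_price:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2
```

Applying `implied_vol` to the quoted price at each available strike gives the pointwise estimate of σ(x) that is then smoothed and extrapolated.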

[072] For example, the standard Black-Scholes theory of option pricing
(see Hull, op. cit.) yields a lognormal pdf f(v) whose expected value is
E_v[v] = s*, such that ln v is a Gaussian (normal) random variable with
variance σ²T, where the parameter σ is called the volatility rate of the
asset, and T is the time to expiration. By a standard property of lognormal
distributions, this implies that the mean of ln v is E_v[ln v] = ln s* − σ²T/2.

[073] From this pdf follows the famous Black-Scholes call option pricing
formula [Hull, Appx. 11A]:

$$c(x) = E_v[\max\{v - x, 0\}] = s^* N(d_1(x)) - x N(d_2(x)), \tag{13}$$

where N(d_1(x)) and N(d_2(x)) are values of the cumulative distribution
function of a Gaussian random variable of mean zero and variance 1 at the
points

$$d_1(x) = \frac{\ln(s^*/x) + \sigma^2 T/2}{\sigma\sqrt{T}} = d_2(x) + \sigma\sqrt{T}, \tag{14}$$

$$d_2(x) = \frac{\ln(s^*/x) - \sigma^2 T/2}{\sigma\sqrt{T}} = \frac{E_v[\ln v] - \ln x}{\sigma\sqrt{T}}. \tag{15}$$

[074] (Recall that our version of the call price is not discounted, and is
given in dollars at time T, and that s* is today's stock price, valued in
dollars at time T, less the value of any dividends.) Note that σ√T is the
standard deviation of ln v; therefore −d_2(x) is just ln x, measured in
standard deviations from the mean E_v[ln v].

[075] Similarly, by put-call parity, we have the Black-Scholes put option
pricing formula

$$p(x) = c(x) + x - s^* = s^*(N(d_1(x)) - 1) - x(N(d_2(x)) - 1) = x N(-d_2(x)) - s^* N(-d_1(x)). \tag{16}$$

[076] Taking the derivative with respect to x, and using s* N'(d_1(x)) =
x N'(d_2(x)) and d_1'(x) = d_2'(x) (the latter equation holding under the
assumption of constant volatility, which we will later drop), we obtain

$$F(x) = c'(x) + 1 = -N(d_2(x)) + 1 = N(-d_2(x)). \tag{17}$$

[077] Now F(x) is the probability that v < x, which is equal to the
probability that ln v < ln x, which, since ln v is Gaussian with mean
E_v[ln v] and standard deviation σ√T, is given by

$$F(x) = \Pr(v < x) = \Pr(\ln v < \ln x) = N\!\left(\frac{\ln x - E_v[\ln v]}{\sigma\sqrt{T}}\right) = N(-d_2(x)). \tag{18}$$

Thus we have verified that the Black-Scholes pricing formulas give the
correct cdf F(x). The derivative of F(x) will thus yield the correct
lognormal distribution f(x) = F'(x).
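
The relations in equations (13) through (18) can be checked numerically. The following is a minimal sketch with hypothetical function names; it evaluates the undiscounted call and put formulas and compares the closed-form cdf N(−d_2(x)) with the slope of the call price curve plus one, the slope being approximated by a central finite difference (an assumption of this sketch, not a step in the derivation).

```python
from math import log, sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # Gaussian with mean zero and variance 1


def d1_d2(x, s_fwd, sigma, T):
    """The points d1(x) and d2(x) of equations (14) and (15)."""
    d2 = (log(s_fwd / x) - sigma * sigma * T / 2) / (sigma * sqrt(T))
    return d2 + sigma * sqrt(T), d2


def call(x, s_fwd, sigma, T):
    """Equation (13): undiscounted call price."""
    d1, d2 = d1_d2(x, s_fwd, sigma, T)
    return s_fwd * STD_NORMAL.cdf(d1) - x * STD_NORMAL.cdf(d2)


def put(x, s_fwd, sigma, T):
    """Equation (16): undiscounted put price."""
    d1, d2 = d1_d2(x, s_fwd, sigma, T)
    return x * STD_NORMAL.cdf(-d2) - s_fwd * STD_NORMAL.cdf(-d1)


def cdf_closed_form(x, s_fwd, sigma, T):
    """Equations (17) and (18): F(x) = N(-d2(x))."""
    return STD_NORMAL.cdf(-d1_d2(x, s_fwd, sigma, T)[1])


def cdf_from_calls(x, s_fwd, sigma, T, h=1e-3):
    """F(x) = c'(x) + 1, with c'(x) taken as a central finite difference."""
    slope = (call(x + h, s_fwd, sigma, T) - call(x - h, s_fwd, sigma, T)) / (2 * h)
    return slope + 1
```

For a constant volatility the two cdf functions agree up to the finite-difference error, and `put` equals `call` plus x minus the forward price.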

[078] Now let F(x) be an arbitrary cdf on ℝ⁺; i.e., a function that
monotonically increases from 0 to 1 as x goes from 0 to infinity. For
simplicity we will assume that F(x) is strictly monotonically increasing;
i.e., f(x) = F'(x) > 0 everywhere. Then there exists a continuous one-to-one
"warping function" y: ℝ⁺ → ℝ such that F(x) = N(y(x)) everywhere;
i.e., such that the probability that a random variable v with cdf F(x) will
satisfy v < x is equal to the probability that a standard Gaussian random
variable n with mean zero and variance 1 will satisfy n < y(x). Similarly,
there is an inverse warping function x(y) such that F(x(y)) = N(y).

[079] Given the warping function y(x), the cdf F(x) may be retrieved
from the relation F(x) = N(y(x)). Therefore the cdf F(x) completely
specifies the warping function y(x), and vice versa; i.e., both curves carry
the same information.

[080] If F(x) is the cdf of a lognormal variable v such that ln v has mean
E_v[ln v] = ln s* − σ²T/2 and variance σ²T, as above, then the warping
function is given by

$$y(x) = -d_2(x) = \frac{\ln x - (\ln s^* - \sigma^2 T/2)}{\sigma\sqrt{T}} = \frac{\ln x - E_v[\ln v]}{\sigma\sqrt{T}}. \tag{19}$$

For this reason we may sometimes write y(x) as −d_2(x), even when the cdf
is not lognormal so that the right-hand equation above for d_2(x) does not
hold.
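
As an illustration, the warping function can be evaluated from any cdf value with an inverse standard normal cdf. The sketch below uses hypothetical function names and the lognormal case of equation (19) as its example; for a cdf estimated from option prices, only `warp` would be needed.

```python
from math import exp, log, sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # Gaussian with mean zero and variance 1


def warp(F_of_x):
    """Warping function value y(x) = N^{-1}(F(x)), so that F(x) = N(y(x))."""
    return STD_NORMAL.inv_cdf(F_of_x)


def lognormal_cdf(x, s_fwd, sigma, T):
    """cdf of a lognormal variable with mean s_fwd and log-variance sigma^2 T."""
    mean_log = log(s_fwd) - sigma * sigma * T / 2
    return STD_NORMAL.cdf((log(x) - mean_log) / (sigma * sqrt(T)))


def lognormal_inverse_warp(y, s_fwd, sigma, T):
    """Inverse warping function x(y) for the lognormal case: solves
    y = (ln x - (ln s* - sigma^2 T / 2)) / (sigma sqrt(T)) for x."""
    return s_fwd * exp(sigma * sqrt(T) * y - sigma * sigma * T / 2)
```

In the lognormal case `warp(lognormal_cdf(x, ...))` reproduces −d_2(x), and `lognormal_inverse_warp` maps that value back to x.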

[081] If f(x) is not lognormal then the Black-Scholes pricing formulas
do not hold. Given an option price c(x) or p(x), it is common practice to
define the implied volatility σ(x) as the value of σ such that the
Black-Scholes pricing formula holds, for a given x, s and T. The implied
volatility curve σ(x) so defined is a function of the strike price x, which is
constant if and only if the pdf f(x) is actually lognormal. In practice, it is
typically a convex U-shaped curve, called a "volatility smile." See, e.g.,
Hull, Chapter 17. Another, new way to compute implied volatilities is first
to compute a finite subset of cdf values F_{n+1/2}, then to invert the
Black-Scholes cdf formula at these values.

[082] When this is done for a range of values of x, an estimate of a
generally different implied volatility curve σ_1(x) is obtained, called the
cdf-implied volatility curve, which represents the value of σ such that the
Black-Scholes cdf formula F(x) = N(−d_2(x, σ, T)) holds, for a given x, s
and T. Again, this curve may be smoothed and extrapolated by any
standard method to give a smoothed curve. Then a corresponding
smoothed cdf may be computed from the Black-Scholes cdf formula, and
differentiated once to give a smoothed pdf. Finally, again, given such
estimated curves for a discrete subset of future times T, standard
interpolation and extrapolation techniques may be used to estimate such
curves for other specified values of T, or for a continuous range of T > 0.
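
Inverting the Black-Scholes cdf formula for a single cdf value reduces to a quadratic equation, which may be sketched as follows. The function names are hypothetical. For strikes above the forward price the quadratic can have two positive roots, and the text does not say how to choose between them; the sketch therefore assumes a caller-supplied reference volatility `sigma_ref` and returns the root nearest to it.

```python
from math import log, sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # Gaussian with mean zero and variance 1


def bs_cdf(x, s_fwd, sigma, T):
    """Black-Scholes cdf formula F(x) = N(-d2(x, sigma, T))."""
    u = sigma * sqrt(T)
    return STD_NORMAL.cdf(log(x / s_fwd) / u + u / 2)


def cdf_implied_vol(F, x, s_fwd, T, sigma_ref=0.4):
    """Solve F = N(-d2(x, sigma, T)) for sigma.

    With y = N^{-1}(F), L = ln(x / s_fwd) and u = sigma * sqrt(T), the
    condition y = L / u + u / 2 is the quadratic u^2 - 2*y*u + 2*L = 0.
    When two positive roots exist, the one nearest sigma_ref (an assumed
    prior guess, not part of the source method) is returned.
    """
    y = STD_NORMAL.inv_cdf(F)
    L = log(x / s_fwd)
    disc = y * y - 2 * L
    if disc < 0:
        raise ValueError("no volatility reproduces this cdf value")
    roots = [y + sqrt(disc), y - sqrt(disc)]
    sigmas = [u / sqrt(T) for u in roots if u > 0]
    if not sigmas:
        raise ValueError("no positive volatility reproduces this cdf value")
    return min(sigmas, key=lambda s: abs(s - sigma_ref))
```

No iterative root search is needed, in contrast to the inversion of the call pricing formula.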

[083] One advantage of using the cdf-implied volatility curve rather
than the conventional implied volatility curve is that the computations
are simpler, at least from an estimate of F(x). In particular, it gives a
simpler and arguably more intuitive relationship between volatility and the
cdf F(x). If we use the traditional implied volatility σ(x), then the
relationship is instead

$$F(x) = N(-d_2(x)) + \frac{\partial c}{\partial \sigma}\,\sigma'(x). \tag{20}$$

[084] Another advantage is that it fits better with the multivariate
techniques to be discussed below.

[085] We have observed that the two curves σ(x) and σ_1(x) seem to be
fairly similar, at least as to the direction of their slope, and are generally
not too far apart in value "near the money." Also σ_1(x) = σ(x) whenever
σ(x) has zero slope, though σ_1(x) is a little smaller than σ(x) when the slope
σ'(x) is negative (which often occurs for stocks). See the above equation.
Finally, one function is as ad hoc as the other. Therefore, because of the
above reasons, we generally prefer to use the cdf-implied volatility curve.

[086] In any case, it is clear that either σ(x) or σ_1(x) contains the same
information as any of the curves c(x), p(x), F(x) or f(x). From σ(x) or
σ_1(x) we can recover c(x) or F(x) using the Black-Scholes call option
pricing or cdf formula, and from this we can obtain all other curves.

[087] A particular method for finding a smoothed and extrapolated
implied volatility curve σ_1(x, T) as a function of both strike price x and
time T to expiration is as follows.

[088] The volatility curve σ(x) or σ_1(x) may be calculated pointwise from
the corresponding curve c(x) or p(x) to give a set of values at a finite
subset of strike prices x. Each of these values may be deemed to have a
certain degree of reliability.

[089] It is then a standard problem to fit a smoothed and extrapolated
curve to these points, taking into account their relative reliabilities. Any
standard smoothing and extrapolation method may be used. In general, the
usual problems of avoiding overfitting or oversmoothing must be
addressed.

[090] It is well-known that implied volatilities also vary with time. We
generally wish to estimate curves σ(x, T) or σ_1(x, T) as replacements for
the constant volatility σ in the Black-Scholes formulas, e.g., c(x) = c(x, σ,
T) or F(x) = N(−d_2(x, σ, T)).

[091] In an especially meaningful example, we have experimented with a
class of smoothing algorithms used in "Implied volatility functions:
Empirical tests," by B. Dumas, J. Fleming and R. E. Whaley, J. Finance,
vol. 53, pp. 2059-2106, Dec. 1998. These authors fit an implied volatility
curve σ(x), for the purpose of setting up a "straw man" option price model
for testing (and defeating) a theory regarding the role of volatility in
option pricing. Their "straw man" option pricing model c(x) was obtained
by putting the resulting smoothed curve back into the Black-Scholes call
formula. It is a "straw man" ad hoc model because no intuitive notion of
stock volatility could possibly vary with strike price, which the stock never
"sees." Nevertheless, their model performed admirably, surpassing in
predictive power the highly regarded "implied tree" method. One possible
explanation offered was that their model mimicked in a smooth way
interpolation methods actually employed by practitioners in the options
markets. (See the discussion of "Volatility matrices" in Hull, cited above.)
Such an approach to option pricing seems ideal to us, because of its
accuracy and because its underlying rationale represents a market view.

[092] Thus, we use the Dumas-Fleming-Whaley model for our own
entirely different purpose, that of forecasting probability distributions. All
that is necessary is to differentiate their call price model, which,
conveniently for us, is a smooth function of strike price and other standard
variables such as time, current stock price, and the risk-free rate of
interest. The formula for the cdf F(x) is, as before, this derivative with 1
added, or

$$F(x) = N(-d_2(x)) + \frac{\partial c}{\partial \sigma}\,\sigma'(x). \tag{21}$$

[093] We can make this very explicit. We have

$$\frac{\partial c}{\partial \sigma} = x\sqrt{T}\,N'(-d_2(x)), \tag{22}$$

where N'(z) denotes standard normal density, while σ'(x) may be
computed by differentiating the Dumas-Fleming-Whaley fitted volatility
curve. The latter has the form

$$\sigma_1(x, T) = a_0 + a_1 x + a_2 x^2 + a_3 T + a_4 T^2 + a_5 x T. \tag{23}$$

[094] The coefficients {a_i} are determined by regression to fit the
available data regarding σ_1(x, T) as closely as possible. Given the
smoothed curve σ_1(x, T), corresponding smoothed cdfs (for different x's
and T's) may be computed from the Black-Scholes cdf formula for each
time T, and differentiated once to give a smoothed pdf. An alternative
procedure, with numerical advantages, is to use a quadratic fit like the
above for a function σ(x, T), and then invert the Black-Scholes cdf to find
σ_1(x, T). Another useful variation is to fit σ(x, T) with a
quadratic function of x at times T which are specific expiration dates, then
linearly interpolate at other times T.

[095] This kind of quadratic curve-fitting is easily implemented. Dumas-
Fleming-Whaley impose a constraint to prevent their volatilities from
going below 0 (or even below 0.01), and we have imposed further
constraints on extrapolations (which we often carry out beyond the range
of their tests), to ensure that the final cdf does not go below zero or above
one. We have experimented with other variations on their basic approach,
for example, using linear interpolation in the time domain, where we do
not need to take derivatives. Our methods would, of course, work with any
approach, possibly quite different, to volatility curve-fitting, though the
general Dumas-Fleming-Whaley approach has many things going for it:
accuracy, conformity to marketplace use of Black-Scholes, smoothness
(differentiability, in particular), conformity to historical experience
regarding the smile structure of volatility curves (especially important for
extrapolation), and simplicity (which, beyond ease of implementation,
helps avoid overfitting). These advantages are achieved in a probability
context that was not considered in the paper where these volatility curves
were introduced.
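
The computation of a cdf from a fitted quadratic volatility surface may be sketched as follows. This is an illustration only: the function names and the coefficient tuple `a` are hypothetical, the quadratic is assumed to be a fit to the conventional implied volatility σ(x, T) so that equations (21) and (22) apply, and simple clipping of the result to [0, 1] stands in for the further constraints on extrapolation mentioned above.

```python
from math import log, sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # Gaussian with mean zero and variance 1
VOL_FLOOR = 0.01  # floor on fitted volatility, as in Dumas-Fleming-Whaley


def raw_surface(x, T, a):
    """Equation (23): a0 + a1 x + a2 x^2 + a3 T + a4 T^2 + a5 x T."""
    return a[0] + a[1] * x + a[2] * x * x + a[3] * T + a[4] * T * T + a[5] * x * T


def smile_vol(x, T, a):
    """Fitted volatility, floored at VOL_FLOOR."""
    return max(raw_surface(x, T, a), VOL_FLOOR)


def smile_slope(x, T, a):
    """Derivative of the fitted volatility in strike (zero on the floor)."""
    if raw_surface(x, T, a) <= VOL_FLOOR:
        return 0.0
    return a[1] + 2 * a[2] * x + a[5] * T


def smile_call(x, s_fwd, T, a):
    """Undiscounted Black-Scholes call price with sigma = smile_vol(x, T, a)."""
    sigma = smile_vol(x, T, a)
    d2 = (log(s_fwd / x) - sigma * sigma * T / 2) / (sigma * sqrt(T))
    return s_fwd * STD_NORMAL.cdf(d2 + sigma * sqrt(T)) - x * STD_NORMAL.cdf(d2)


def smile_cdf(x, s_fwd, T, a):
    """Equations (21) and (22): F(x) = N(-d2) + x sqrt(T) N'(d2) sigma'(x),
    clipped to [0, 1] as a crude guard on extrapolated strikes."""
    sigma = smile_vol(x, T, a)
    d2 = (log(s_fwd / x) - sigma * sigma * T / 2) / (sigma * sqrt(T))
    vega = x * sqrt(T) * STD_NORMAL.pdf(d2)  # N' is even, so N'(-d2) = N'(d2)
    F = STD_NORMAL.cdf(-d2) + vega * smile_slope(x, T, a)
    return min(1.0, max(0.0, F))
```

With a flat surface (all coefficients but a0 equal to zero) `smile_cdf` reduces to N(−d_2(x)); with a sloped surface it agrees with the finite-difference slope of `smile_call` plus one.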

Treatment of multiple assets 

[096] The techniques described so far give probability distributions for 
the future values of a single asset based on option price data for that asset. 
However, in many cases an investor may be concerned with multiple 
assets, for example all of the stocks in his or her portfolio, or in a mutual 
fund, or in a certain index. The investor may also be concerned with a 
security without a quoted option or a security in a hypothetical scenario. 
Moreover, the investor may be concerned with the relations between one 
group of assets and another. 

[097] All of these questions involve considerations of several securities 
at once, and the probabilities of their simultaneous configuration of prices. 
This is clearly a consideration in stock portfolios and mutual funds, but 
also enters when dealing with securities without quoted option prices, 
where we would want to extract as much information as possible about the 
security from those correlated with it that do have quoted options. Finally, 
in scenario analysis there are many questions that involve considering the 
probabilities of several security prices occurring at once, including 
changes in factors influencing the market that might be modeled by 
changes in a portfolio of those securities most affected.

[098] For a portfolio of securities, or a mutual fund, we are interested in
a composite asset of the form

$$x = h_1 x_1 + h_2 x_2 + \cdots + h_n x_n, \tag{24}$$

where the x_i are all assets for which we individually know the cdf F(x_i) or
the pdf f(x_i). To give our method the most flexibility, we do not require
that this knowledge come from any particular procedure, though we favor 
the approach of the preceding two sections. However, even for some 
securities or indices with a quoted option, we might not feel there was 
sufficient option activity to justify a full fit of a volatility curve, and might 
take a cruder substitute, even a flat straight line based on an average of 
available implied volatilities. In addition, it is convenient to allow the 
possibility that a few assets we are monitoring might not have any quoted 
option at all: this is easily accommodated by, say, using a flat volatility 
curve with a historical value for volatility. For testing purposes and 
comparisons we might even want to consider a list of assets with volatility 
curves given this way. 

[099] A general method for dealing with such questions is to generate
multivariate probability distributions for all assets of interest. A
multivariate cdf may be written as F(x_1, x_2, ..., x_n), where the variables
(x_1, x_2, ..., x_n) are the values of the n assets of interest.

[100] We will assume that we know from the techniques described above
or otherwise the marginal cdfs F_i(x_i) for each of the individual variables.
As a first step, we may define for each x_i a function y_i(x_i), called a
"warping function," such that y_i(x_i) is a standard normal (Gaussian)
variable with mean 0 and variance 1. This is simply done by defining y_i(x_i)
such that F_i(x_i) = N(y_i(x_i)) for all values of x_i, where N(x) denotes the
cdf of a standard normal variable. The function y_i(x_i) may be simply
described in terms of σ_1(x_i). Under mild technical conditions such
as having a marginal cdf that varies monotonically, such a warping
function has a well-defined inverse warping function x_i(y_i).

[101] If the asset has an active options market, then the warping
functions may be determined by either first estimating F(x_i) directly from
(finite differences of) options price data or by extrapolating and smoothing
in the volatility domain. In the latter case we have an explicit form of the
warping function y_i(x_i) in terms of a fitted volatility curve σ_1(x_i, T) as
y_i(x_i) = −d_2(x_i, σ_1(x_i, T), T), and this equation can also be used with
any volatility curve with the assets above, that might have fewer or no
traded options. In a later section we will discuss portfolios in the logarithm
domain, possibly containing long and short positions. One can think of
warping them to standard normal directly, subtracting the mean and
dividing by the standard deviation. Alternatively, to keep our notation
uniform, one can invent an asset with price x_i such that −d_2(x_i) gives this
warped value (using for σ_1(x_i) the observed historical volatility). But we
wish to emphasize that the method we are describing works with ANY
single-variable warping functions, even using a different one for each
variable. The only further substantive ingredient is the plausibility of using
JOINTLY normal distributions, which we now discuss.

[102] The general problem is to find a multivariate probability
distribution for the complete set of variables (x_1, ..., x_n), or equivalently
for their logarithms. In simple financial models generalizing the Black-
Scholes framework, the multivariate distribution of the logvariables is
multivariate (i.e., jointly) normal: see Musiela and Rutkowski's book
"Martingale methods in financial markets" (1999). This implies that all
portfolios of these logvariables are jointly normal, and can also be used
with other logvariables and portfolios of them to form a jointly normal
distribution. Thus, if we wish, it is reasonable to use BARRA (or
functionally equivalent) factors as single (log)variables in our model,
using, say, individual normal distributions for them based on historical
volatility. These factors may represent fundamentals of companies or even
macroeconomic variables such as interest rates. We do not further discuss
such factors, but refer to the book of Grinold and Kahn cited above, which
also describes how to closely approximate them as portfolios of security
returns. Our preference is to not use BARRA factors directly, but stay as
much as possible in the world of optionable securities, and address
questions involving BARRA factors in terms of approximating portfolios
consisting mostly of optionable securities. (But for testing and
comparisons, it is still useful to be able to include them directly, and we do
have that capability.)

[103] Now we certainly do not wish to use only the simple
multidimensional Black-Scholes model, which would not directly allow the
non-lognormal input from our single-variable distributions based on the
options markets. At the same time, option prices on individual assets do
not tell us anything about how assets interact, in particular, their
correlations. Fortunately, correlations may be estimated from past
(historical) data, and may be viewed as covariances for data that has been
standardized (has standard deviation 1). Each multivariate normal
distribution is determined by its mean and covariance matrix. Thus, a
natural approach is to use the individual distributions to transform or
"warp" the variables to standard normal, then impose a multivariate
normal structure based on the correlation matrix. This procedure is
independent of the individual warping functions, which may be different
for different individual variables, and in particular, can incorporate our
market-based option distributions for individual variables representing
securities with active options markets. A slightly different approach is to
use correlations of the warped variables. This procedure is likely to be
more accurate, but may involve more computational time.

[104] Second, we assume that we can find the historical pairwise
correlations between the warped standard normal variables. These
correlations may be computed by standard techniques from any available
set of historical asset price data. We denote by C the n×n historical
correlation matrix of the log variables (ln v_1, ..., ln v_n), whose entries are
the cross-correlations

$$\rho_{ij} = \frac{E(\ln v_i \ln v_j) - E(\ln v_i)\,E(\ln v_j)}{\sqrt{\mathrm{Var}(\ln v_i)\,\mathrm{Var}(\ln v_j)}}. \tag{25a}$$

As before, it is notationally convenient to use v_i as a second notation for x_i,
favoring the latter for fixed values and the former as a variable. Because
each of the variables y_i(x_i) is standard normal, the diagonal terms of C
are all equal to 1, and C is a positive semi-definite covariance matrix,
which we may here assume to be nonsingular (positive definite).

[105] If we use instead correlations of warped variables, we have simply

$$\rho_{ij} = E(y_i y_j). \tag{25b}$$
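
A sample estimate of the matrix C may be sketched as follows. The function names and the use of successive log-price differences as the historical data are assumptions of this sketch; any standard estimation technique could be substituted.

```python
from math import log, sqrt


def log_returns(prices):
    """Successive differences of log prices for one asset."""
    return [log(b / a) for a, b in zip(prices, prices[1:])]


def correlation(u, v):
    """Sample version of equation (25a) for two equal-length series."""
    n = len(u)
    mu_u, mu_v = sum(u) / n, sum(v) / n
    cov = sum((a - mu_u) * (b - mu_v) for a, b in zip(u, v)) / n
    var_u = sum((a - mu_u) ** 2 for a in u) / n
    var_v = sum((b - mu_v) ** 2 for b in v) / n
    return cov / sqrt(var_u * var_v)


def correlation_matrix(series):
    """n x n matrix C of pairwise correlations; the diagonal is 1."""
    n = len(series)
    return [[1.0 if i == j else correlation(series[i], series[j])
             for j in range(n)] for i in range(n)]
```

If the warped variables are used instead, as in equation (25b), the same functions can be applied to series of warped values, since for standardized data the correlation equals E(y_i y_j).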

[106] Now let F_C(x_1, ..., x_n) denote the cdf of a multivariate Gaussian
random n-tuple with zero mean and covariance matrix C. Thus,
F_C(b_1, ..., b_n) is the probability that each variable y_i is at most some
value b_i. There are more elaborate versions, such as
F_C(a_1, ..., a_n; b_1, ..., b_n), giving the probability that each y_i satisfies
a_i < y_i < b_i. In the single-variable case these latter functions are
obtained from the simple cdf by a single subtraction, involving two terms,
but the corresponding bivariate case involves four terms, and in n
dimensions there would be 2^n terms. However, each of these more
elaborate cdfs can be directly computed as an integral, just like the simple
cdf. Since the more elaborate cdfs are needed for Monte Carlo
calculations, possibly in high dimensions, it is best to think of them as
being computed directly.

[107] We then define the multivariate cdfs

$$F(x_1, x_2, \ldots, x_n) = F_C(y_1(x_1), y_2(x_2), \ldots, y_n(x_n)) \tag{26}$$

and

$$F(a_1, \ldots, a_n; b_1, \ldots, b_n) = F_C(y_1(a_1), \ldots, y_n(a_n); y_1(b_1), \ldots, y_n(b_n)), \tag{27}$$

where the y_i(x_i) are the known warping functions for the individual
variables. We find it convenient, with some abuse of language, to speak of
F(x_1, ..., x_n) as "the cdf", even though we have all of the above functions
in mind, and to use F(x_1, ..., x_n) as a proxy for the whole distribution
(which it does, theoretically, determine). Then F(x_1, x_2, ..., x_n) is a
multivariate cdf that (a) has the correct (given) marginal cdfs F_i(x_i), and
(b) has the correct (historical) correlations between the warped standard
normal variables y_i(x_i). We use this cdf to answer questions involving the
variables (x_1, x_2, ..., x_n). That is, since the marginals of F_C(z_1, ..., z_n)
are Gaussian with mean 0 and variance 1, the marginals of F(x_1, ..., x_n)
are equal to N(y_i(x_i)) = F_i(x_i); i.e., they are correct according to each
single-variable model. If the logvariables (ln v_1, ..., ln v_n) are actually
jointly Gaussian, then the multivariate cdf F(x_1, ..., x_n) is correct.

[108] In summary, the true joint distribution is approximated by a jointly
lognormal distribution using historical correlations, combined with
warping functions on each variable such that the marginal distribution of
each variable is correct according to a selected single-variable model (for
example, according to our single-variable model for optionable securities,
or according to the lognormal model using historical volatility). The single
variables may actually be portfolios, with a default distribution for the
portfolio return being lognormal, based on historical volatility. This
multivariate theory generalizes both our single-variable theory and
standard multivariate (log)Gaussian models. It again allows for market
input through option prices, to the extent that components have an active
option market, but does not exclude non-optionable securities, and also
allows portfolios as single variables. In this way BARRA (or functionally
equivalent) factors are also allowed because of their interpretation as
portfolios of long and short positions.

[109] For example, the investor might have a portfolio consisting of a
given quantity of each of these assets. The value of such a portfolio is the
sum

$$x = h_1 x_1 + h_2 x_2 + \cdots + h_n x_n, \tag{28}$$

where h_i is an arbitrary coefficient that represents the quantity of the i-th
asset in the portfolio. The investor might be interested in an estimate of
the probability distribution of the value x of the whole portfolio.

[110] Such an estimate may be obtained by Monte Carlo simulation. For
such a simulation, a large number N of samples from the multivariate
Gaussian cdf F_C(y_1, ..., y_n) may be generated. Each sample (y_1, ..., y_n)
may be converted to a sample (x_1, x_2, ..., x_n) by using the inverse warping
functions x_i(y_i). The value x of the total portfolio may then be computed
for each sample:

$$x = h_1 x_1(y_1) + \cdots + h_n x_n(y_n). \tag{29}$$

[111] From these N values of x, the probability distribution of x (e.g., its
cdf F(x)) may be estimated.

[112] After enough samples, we will have an approximation to the cdf of
x. More precisely, the probability that a < x < b is, approximately, the
fraction of samples (y_1, ..., y_n) with a < h_1 x_1(y_1) + ... + h_n x_n(y_n) < b,
and this approximation becomes exact in the limit for large sample sizes.
This works for real portfolios, or for portfolios constructed from a
number of assets and a residual variable, as might arise from a regression.
Usually the regression is done in the log domain, which we discuss below.
Note that the Monte Carlo method just described works perfectly well if
the expression for x above is replaced by any function f(x_1, ..., x_n) of the
x_i, possibly quite nonlinear.

[113] In practice, it is useful to save the N multivariate samples in a large
database. Then the cdf of any quantity whose value is a function of the
variables (x_1, x_2, ..., x_n) may be estimated from this database. For
example, if the investor would like to know the cdf of some alternative
portfolio with different quantities of each asset, this can be quickly
determined from the stored database.

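
The Monte Carlo procedure may be sketched as follows. This is a minimal illustration under stated assumptions: the correlation matrix, holdings, and lognormal marginals are hypothetical sample values, the correlated Gaussian samples are drawn with a Cholesky factor (one standard technique among several), and in practice each inverse warping function would come from the fitted single-variable model for that asset.

```python
import random
from math import exp, sqrt


def cholesky(C):
    """Lower-triangular L with L L^T = C, for a positive definite matrix C."""
    n = len(C)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = sqrt(C[i][i] - s)
            else:
                L[i][j] = (C[i][j] - s) / L[j][j]
    return L


def sample_warped(C, n_samples, rng):
    """Draw samples (y_1, ..., y_n) from the Gaussian cdf F_C."""
    L = cholesky(C)
    n = len(C)
    samples = []
    for _ in range(n_samples):
        z = [rng.gauss(0.0, 1.0) for _ in range(n)]
        samples.append([sum(L[i][k] * z[k] for k in range(i + 1)) for i in range(n)])
    return samples


def lognormal_inverse_warp(s_fwd, sigma, T):
    """Inverse warping function x_i(y_i) for a lognormal marginal."""
    return lambda y: s_fwd * exp(sigma * sqrt(T) * y - sigma * sigma * T / 2)


def portfolio_values(samples, inverse_warps, holdings):
    """Equation (29): x = h_1 x_1(y_1) + ... + h_n x_n(y_n) for each sample."""
    return [sum(h * w(y) for h, w, y in zip(holdings, inverse_warps, ys))
            for ys in samples]


def prob_between(values, a, b):
    """Fraction of samples with a < x < b."""
    return sum(1 for v in values if a < v < b) / len(values)


# Hypothetical two-asset example. The stored samples can be reused for any
# alternative holdings without drawing new ones.
rng = random.Random(12345)
C = [[1.0, 0.6], [0.6, 1.0]]
stored = sample_warped(C, 20000, rng)
warps = [lognormal_inverse_warp(100.0, 0.4, 0.5),
         lognormal_inverse_warp(50.0, 0.3, 0.5)]
values = portfolio_values(stored, warps, holdings=[10.0, 20.0])
```

Calling `portfolio_values(stored, warps, holdings=...)` again with different holdings re-prices an alternative portfolio from the same stored samples, and `prob_between` then estimates the probability of any interval of portfolio values.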
[114] An investor may also determine the effect of one portfolio (or
event(s) or variables such as interest rates, P/E ratios, public interest in a 
certain sector of the market) on another portfolio as follows. Assume that 
the first portfolio is represented by x, where 



where each may be viewed as the price of a portfolio component, and 
the second portfolio is represented by where 



where each y, may be viewed as the price of a portfolio component or 
more broadly as any macro-economic variable (macroeconomic, 
fundamental, or sector related). 

[115] In some examples, these methods can be used with another 
paradigm in common use in the financial communitv. It is common to 
work in the return domain, or equivalentlv, with logarithms: i.e., 

Ln \ = P ] _\nx ] _ + ... ft„\nx„ . (32) 

[116] Ignoring anv possible identification of these variables with those 
used above, the same discussion and Monte Carlo method as above 

applies, if we regard x as a nonlinear portfolio x = f(x\ Xn)^ exp(^i In 

x^ + ... ^„ In xj. If the sum B of the yg/s is 1, such an x may be written x = 
^jX] +^2^2 + • • + ^n^n whcrc fij = ftix/x^. Even if B is not 1, incremental 

changes ("returns") dlnx computed from this equation for x are consistent 
with the above expression for In x. It is common in the financial 
communitv to think of as approximately a constant hu so that for short 
periods, where the x/S do not change too much, this equation for x is 



X = X] + /?2 X2 X, 



(30) 



y = g] X} +g2X2 + ... +g„X„. 



(31) 
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comparable to the portfolio equation in the previous subsection. Thus, 
h.^^ = p.^^ = pd\n so that for small changes dxi the change dx 

X 

from the first equation is approximately the same as would be obtained 
from the second. However, this relationship requires "rebalancing" to 
remain a good approximation for longer periods. 
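The relation between the log-domain portfolio and a fixed-holdings portfolio can be checked numerically. The sketch below is illustrative only: the coefficients, prices, and function names are hypothetical, and it simply evaluates $x = \exp(\sum_i \beta_i \ln x_i)$ and the implied holdings $h_i = \beta_i x / x_i$ for a small price move.

```python
import math

def log_portfolio(betas, xs):
    """x = exp(beta_1 ln x_1 + ... + beta_n ln x_n)."""
    return math.exp(sum(b * math.log(x) for b, x in zip(betas, xs)))

def implied_holdings(betas, xs):
    """h_i = beta_i * x / x_i, evaluated at the current prices."""
    x = log_portfolio(betas, xs)
    return [b * x / xi for b, xi in zip(betas, xs)]

betas = [0.5, 0.3, 0.2]          # hypothetical coefficients summing to 1
xs0 = [100.0, 50.0, 20.0]        # hypothetical current prices
xs1 = [101.0, 49.5, 20.2]        # prices after a small move

x0 = log_portfolio(betas, xs0)
h = implied_holdings(betas, xs0)

# Change in x from the log-domain equation versus the fixed-holdings equation.
dx_log = log_portfolio(betas, xs1) - x0
dx_lin = sum(hi * (new - old) for hi, old, new in zip(h, xs0, xs1))
print(x0, dx_log, dx_lin)
```

For larger moves the two changes drift apart, which is the "rebalancing" caveat in the text: the holdings would have to be recomputed at the new prices.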

[117] For an asset $x$ not given explicitly in terms of the $x_i$,
we obtain a similar expression via linear regression:

$\ln x = \beta_0 \ln x_0 + \beta_1 \ln x_1 + \dots + \beta_n \ln x_n$.    (33)

[118] The $\beta_i$ for $i \neq 0$ are correlation coefficients chosen to minimize the
variance of the residual in historical data (perhaps subject to constraints,
such as $\beta_i \geq 0$ and $\sum_i \beta_i = 1$). For example, $x$ might be a security
without a quoted option, and the $x_i$ for $i \neq 0$ could be taken as assets for
which we individually know the probability distributions, in addition to
the required correlation coefficients for $x$. We have written the residual
term as $\beta_0 \ln x_0$ (usually thinking of $\beta_0 = 1$ and the residual as normally
distributed; for the residual term $i = 0$, we can use a constant variance or
impose some generic non-constant structure based on observed behavior).
The mean of the latter could be nonzero, giving the regression "alpha" - a
constant term making the mean of the regression correct. Alternatively, we
could modify the equation to allow an explicit alpha, and keep the residual
mean zero. Another minor variation might include the addition of a
dummy variable with constant return, to adjust the value of $x$ up or down.
In particular, this gives another way of adjusting the residual mean to zero.
This equation gives the previous one as a special case if we allow $\beta_0 = 0$.
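A log-domain regression of this kind can be sketched with ordinary least squares. The example below is a hedged illustration, not the procedure of the specification: it uses the explicit-alpha variant mentioned above, imposes no constraints on the coefficients, and fits synthetic data. All names and numerical values are hypothetical.

```python
import random

def solve(a, b):
    """Solve a small linear system a z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    z = [0.0] * n
    for r in reversed(range(n)):
        z[r] = (m[r][n] - sum(m[r][k] * z[k] for k in range(r + 1, n))) / m[r][r]
    return z

def fit_log_regression(log_x, log_factors):
    """Least-squares fit of ln x = alpha + sum_i beta_i ln x_i + residual."""
    rows = [[1.0] + list(f) for f in log_factors]
    k = len(rows[0])
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    aty = [sum(r[i] * y for r, y in zip(rows, log_x)) for i in range(k)]
    coef = solve(ata, aty)
    resid = [y - sum(c * v for c, v in zip(coef, r)) for r, y in zip(rows, log_x)]
    return coef[0], coef[1:], resid

# Synthetic log-domain history with known coefficients (illustration only).
rng = random.Random(1)
true_alpha, true_betas = 0.001, [0.6, 0.4]
factors = [[rng.gauss(0.0, 0.02) for _ in true_betas] for _ in range(2000)]
log_x = [true_alpha + sum(b * f for b, f in zip(true_betas, row)) + rng.gauss(0.0, 0.005)
         for row in factors]
alpha, betas, resid = fit_log_regression(log_x, factors)
print(alpha, betas)
```

With an explicit alpha in the fit, the residuals average to zero, which corresponds to the "keep the residual mean zero" alternative in the paragraph above.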
[119] Another approach, which promises to be relatively fast
computationally, is the following. As in the development of cdf-implied
volatilities in Section 2, let us assume that each log variable $\ln x_i$ above is
"Gaussian" with non-constant variance $\sigma_I(x_i)^2 T$. In other words, the cdf is
given by $F(x_i) = N(-d_2(x_i, \sigma_I(x_i), T))$. Our aim will be to give $F(x)$ by a
similar equation, using some kind of fitted curve $\hat\sigma_I(x)$. We will assume
that we have some class of volatility curves in mind, with a small number
of parameters which must be determined.

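[120] If the variables $\ln x_0, \ln x_1, \dots, \ln x_n$ were truly jointly Gaussian, then
$\ln x$ would also be Gaussian. Its variance would be given by the formula

$\mathrm{Var}(\ln x) = \sum_{i,j} \beta_i \sigma_i \rho_{ij} \beta_j \sigma_j T$,    (34)

where $\rho_{ij}$ is the correlation between $\ln x_i$ and $\ln x_j$, and
$\sigma_i^2 T = \mathrm{Var}(\ln x_i)$. We therefore define the estimate
$\hat\sigma_I(x)$ of $\sigma_I(x)$ by the conditional expectation

$\hat\sigma_I(x)^2\,T = E\Big[\sum_{i,j} \beta_i \sigma_i \rho_{ij} \beta_j \sigma_j T \;\Big|\; \ln x = \sum_i \beta_i \ln x_i\Big]$,    (35)

where each $\sigma_i$ inside the expectation is the non-constant volatility
$\sigma_I(x_i)$ evaluated at the sampled $x_i$.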



[121] The calculation of the above conditional expectation may be done
with Monte Carlo methods. In the language of nonlinear portfolios above,
we would take the function $f(x_1, \dots, x_n)$ to be 0 outside a thin
multidimensional solid enclosing the hyperplane defined by
$\ln x = \sum_i \beta_i \ln x_i$. Inside the solid we would take $f(x_1, \dots, x_n)$
equal to the above expansion for $\mathrm{Var}(\ln x)$, divided by the probability
of being in the solid (also a Monte Carlo calculation). In terms of samples,
we just take the average of $\mathrm{Var}(\ln x)$ over all the samples that end up
inside the thin solid. However, it is not necessary to compute all values of
$\hat\sigma_I(x)$, but only enough to fit the parameters for the volatility curves
we are using.
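The "thin solid" average can be sketched as below. This is an assumption-laden illustration rather than the specification's implementation: the two local volatility functions are hypothetical, and the samples are drawn from a constant-volatility correlated lognormal for brevity instead of from the full joint distribution.

```python
import math
import random

def variance_expansion(betas, vols, rho, T):
    """sum_{i,j} beta_i sigma_i rho_ij beta_j sigma_j T for one set of volatilities."""
    n = len(betas)
    return T * sum(betas[i] * vols[i] * rho[i][j] * betas[j] * vols[j]
                   for i in range(n) for j in range(n))

def slab_average(samples, betas, vol_fns, rho, T, log_x, half_width):
    """Average the expansion over samples with |sum_i beta_i ln x_i - log_x| < half_width."""
    total, count = 0.0, 0
    for xs in samples:
        level = sum(b * math.log(x) for b, x in zip(betas, xs))
        if abs(level - log_x) < half_width:
            vols = [fn(x) for fn, x in zip(vol_fns, xs)]
            total += variance_expansion(betas, vols, rho, T)
            count += 1
    return total / count if count else float("nan")

# Hypothetical two-asset setup.
T, corr = 1.0, 0.5
rho = [[1.0, corr], [corr, 1.0]]
betas = [0.6, 0.4]
vol_fns = [lambda x: 0.20 + 0.10 * max(0.0, 1.0 - x),   # assumed skewed curves
           lambda x: 0.30 + 0.05 * max(0.0, 1.0 - x)]

rng = random.Random(0)
samples = []
for _ in range(50000):
    z1 = rng.gauss(0.0, 1.0)
    z2 = corr * z1 + math.sqrt(1.0 - corr ** 2) * rng.gauss(0.0, 1.0)
    samples.append([math.exp(0.20 * z1), math.exp(0.30 * z2)])

var_hat = slab_average(samples, betas, vol_fns, rho, T, log_x=-0.1, half_width=0.01)
sigma_hat = math.sqrt(var_hat / T)   # estimated volatility at this level of ln x
print(sigma_hat)
```

Repeating the call for a handful of `log_x` levels gives the few points needed to fit a parametric volatility curve, as the paragraph notes.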

[122] The estimated mean of $\ln x$ would be $\ln s^* - \tfrac{1}{2}\hat\sigma_I(x)^2 T$, with $s^*$
determined as before, or replaced with some risk-averse estimate, to obtain
the risk-averse or "true" distribution. (It is common, incidentally, to use
factor models such as these to estimate a risk-averse version of
$\ln s^* = \sum_i \beta_i \ln s_i^*$ from risk-averse values of $s_i^*$.)

[123] Also, we mention here one useful variation: We may prefer not to
view the residual term $\ln x_0$ as part of the model, and instead write
down a joint pdf only for $\ln x_1, \dots, \ln x_n$. In this case we can use the
double expectation

$\hat\sigma_I(x)^2\,T = E\Big[\,E\Big[\sum_{i,j} \beta_i \sigma_i \rho_{ij} \beta_j \sigma_j T \;\Big|\; \ln x = \sum_i \beta_i \ln x_i\Big]\Big]$,    (36)

where the inner expectation is with respect to the variables $x_1, x_2, \dots, x_n$,
and the outer expectation is with respect to the residual. We might take the
standard deviation $\sigma(x_0)$ of the residual (taking $\beta_0 = 1$) as a constant,
determined historically, or make an estimate based on some leverage
model.

[124] Now we can estimate the cdf $F(x)$ by

$F(x) \approx N(-d_2(x, \hat\sigma_I(x), T))$    (37)

as in the univariate case. To summarize, we use our multivariate model to
determine parameters for a univariate model of the portfolio. After that is
done, we can obtain probabilities for the portfolio without having to go
back to the multivariate model, thus achieving a savings in time. We could
take this one step further and think of randomly generating values of
$\hat\sigma_I(x)$ independently of any Monte Carlo philosophy (but perhaps still
throwing away values of $x$ too far out of the money), and then using the
values obtained to do the regression required in the Dumas-Fleming-Whaley
approach.
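Once a volatility curve has been fitted, evaluating the univariate cdf is inexpensive. The sketch below assumes the form of $d_2$ that the text states in paragraph [132], $d_2(s^*, x, \sigma) = (\ln(s^*/x) - \sigma^2 T/2)/(\sigma\sqrt{T})$. The fitted curve `vol_curve`, the level `s_star`, and the horizon are hypothetical values chosen for illustration.

```python
import math

def norm_cdf(z):
    """Standard normal cumulative distribution N(z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def d2(s_star, x, sigma, T):
    """d_2(s*, x, sigma) = (ln(s*/x) - sigma^2 T / 2) / (sigma sqrt(T))."""
    return (math.log(s_star / x) - 0.5 * sigma ** 2 * T) / (sigma * math.sqrt(T))

def portfolio_cdf(x, s_star, vol_curve, T):
    """F(x) ~ N(-d_2(s*, x, sigma(x), T)) for a fitted volatility curve sigma(x)."""
    return norm_cdf(-d2(s_star, x, vol_curve(x), T))

# Hypothetical fitted curve: higher volatility below the level s*.
s_star, T = 100.0, 0.5
vol_curve = lambda x: 0.25 + 0.10 * max(0.0, (s_star - x) / s_star)

for level in (80.0, 100.0, 120.0):
    print(level, round(portfolio_cdf(level, s_star, vol_curve, T), 4))
```

With a constant curve this reduces to the ordinary lognormal cdf, whose median is $s^* \exp(-\sigma^2 T/2)$.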

[125] In another example, the multivariate distribution lends itself to the 
study of many questions regarding conditional probabilities. For example, 
suppose that we want to know the effect of an increase or decrease of 
some segment of the market on a portfolio, or of an increase or decrease of 
some macro-economic factor. BARRA, following earlier ideas of Ross,
viewed such macro-economic factors as portfolios with both long and
short positions. Similarly, BARRA considers market segments associated
to price-to-earnings ratios and other fundamental parameters, as well as to
industry groupings, as portfolios. (See the book of Grinold-Kahn cited 
above.) Thus, we are led simply to consider the effect of one portfolio on 
another. 

[126] Consider the "what-if" question: letting $A$ and $B$ be given positive
constants, if $x > A$ at time $T$, what is the probability that $y > B$ at time $T$?
This question can be answered by creating a Monte Carlo database as
above for the multivariate cdf $F(x_1, x_2, \dots, x_n)$ corresponding to time $T$,
identifying those samples for which $x > A$, and then using only these
samples to estimate the probability that $y > B$. More generally, any
conditional cdf of the form $F(x \mid E)$ can be estimated similarly, where $x$ is
any function of the variables $(x_1, x_2, \dots, x_n)$ and $E$ is any event defined in
terms of the variables $(x_1, x_2, \dots, x_n)$.
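The filtering step can be sketched directly: keep only the samples satisfying the conditioning event, then count how often the event of interest occurs among them. The database below is a hypothetical two-asset example with a shared driver, and the portfolios and thresholds are made-up values.

```python
import math
import random

def conditional_prob(samples, event, given):
    """Estimate Pr(event | given) as a ratio of counts over the stored samples."""
    kept = [s for s in samples if given(s)]
    if not kept:
        return float("nan")
    return sum(1 for s in kept if event(s)) / len(kept)

# Hypothetical database of (x_1, x_2) samples sharing a common driver.
rng = random.Random(42)
database = []
for _ in range(20000):
    common = rng.gauss(0.0, 0.15)
    database.append((math.exp(common + rng.gauss(0.0, 0.10)),
                     math.exp(common + rng.gauss(0.0, 0.20))))

x = lambda s: 3.0 * s[0] + 1.0 * s[1]   # first portfolio (assumed holdings)
y = lambda s: 1.0 * s[0] + 2.0 * s[1]   # second portfolio (assumed holdings)
A, B = 4.4, 3.3

p_cond = conditional_prob(database, lambda s: y(s) > B, lambda s: x(s) > A)
p_uncond = conditional_prob(database, lambda s: y(s) > B, lambda s: True)
print(p_cond, p_uncond)
```

The conditioning event can be any predicate on the sample vector, which is what makes the general form $F(x \mid E)$ no harder than the special case.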
[127] Similarly, suppose an investor would like to know whether it is
reasonable to believe that a certain stock or portfolio $x$ will have a value
greater than a given constant $A$ at time $T$. This kind of question can be
addressed by estimating the conditional cdf of some other related and
perhaps better-understood variable (or combination of variables) $y$ at time
$T$, given that $x > A$. If the resulting distribution for $y$ does not look
reasonable, then the investor may conclude that it is unreasonable to
expect that $x > A$.

[128] For definiteness, let us suppose the first portfolio is $x$, where as
above

$\ln x = \beta_0 \ln x_0 + \beta_1 \ln x_1 + \dots + \beta_n \ln x_n$    (38a)

and the second portfolio is $y$, where

$\ln y = \gamma_0 \ln y_0 + \gamma_1 \ln x_1 + \dots + \gamma_n \ln x_n$.    (38b)

[129] We take $\beta_0 = \gamma_0 = 1$, and view $\ln x_0$ and $\varepsilon = \ln y_0$ as residuals with
mean 0. The latter residual is not assumed to be a factor in our
multivariate model. Consider the following typical "what-if" question: Let
$A$ and $B$ be given positive constants. If we know $x > A$ at time $T$, what is
the probability that $y > B$ at time $T$? We give two approaches to this
problem, the first probably quicker, but possibly not as accurate, using a
regression to avoid at least some Monte Carlo calculations.

[130] In another example, note that $\ln y > \ln B$ if and only if
$\ln y - \varepsilon > \ln B - \varepsilon$. All correlations $\rho_{ij}$ between $\ln x_i$ and $\ln x_j$ are assumed known. We
may also assume that we have historical values of volatilities



expected values of implied volatilities, but it would not be difficult to 
maintain an inventory of historical values, and more in the spirit of this
part of the calculation to do so.) 

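[131] Thus we can estimate the historical covariances between $\ln x$ and
$\ln y - \varepsilon$:

$\mathrm{Cov}(\ln x, \ln y - \varepsilon) \approx \sum_{i,j} \beta_i \sigma_i \rho_{ij} \gamma_j \sigma_j T$,    (39)

as well as $\sigma_{\ln x} = \sqrt{\mathrm{Var}(\ln x)}$,
$\sigma_{\ln y - \varepsilon} = \sqrt{\mathrm{Var}(\ln y - \varepsilon)}$ and the correlation

$\rho(\varepsilon) = \rho_{\ln x,\, \ln y - \varepsilon} = \dfrac{\mathrm{Cov}(\ln x, \ln y - \varepsilon)}{\sigma_{\ln x}\, \sigma_{\ln y - \varepsilon}}$.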
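The covariance sum and the resulting correlation are straightforward to compute from historical inputs. The following sketch uses hypothetical volatilities, correlations, and coefficients, and simply evaluates the double sum of equation (39) three times: once for the covariance and twice for the two variances.

```python
import math

def log_cov(coef_a, coef_b, sigmas, rho, T):
    """sum_{i,j} a_i sigma_i rho_ij b_j sigma_j T."""
    n = len(sigmas)
    return T * sum(coef_a[i] * sigmas[i] * rho[i][j] * coef_b[j] * sigmas[j]
                   for i in range(n) for j in range(n))

def portfolio_correlation(betas, gammas, sigmas, rho, T):
    """Correlation between the two log portfolios built from the same factors."""
    cov = log_cov(betas, gammas, sigmas, rho, T)
    sd_x = math.sqrt(log_cov(betas, betas, sigmas, rho, T))
    sd_y = math.sqrt(log_cov(gammas, gammas, sigmas, rho, T))
    return cov / (sd_x * sd_y)

# Hypothetical historical inputs for two factors.
sigmas = [0.2, 0.3]
rho = [[1.0, 0.5], [0.5, 1.0]]
betas, gammas = [0.7, 0.3], [0.2, 0.8]
print(portfolio_correlation(betas, gammas, sigmas, rho, T=1.0))
```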



[132] This gives a standard regression for the variable $\ln y - \varepsilon$ expressed
in standard deviations from its mean, in terms of a similarly standardized
expression for $\ln x$. Note that $\varepsilon$ has mean 0 by construction. Put

$d_2(s^*, x, \sigma) = \dfrac{\ln(s^*/x) - \sigma^2 T/2}{\sigma\sqrt{T}}$.

Thus $-d_2(s_x^*, x, \sigma_{\ln x})$ measures standardized $\ln x$ using historical
volatility, and $-d_2(x) = -d_2(s_x^*, x, \hat\sigma_I(x))$ measures "standardized"
(warped) $\ln x$ using the cdf-implied volatility curve $\hat\sigma_I(x)$, as
discussed in the previous section. Here $s_x^*$ denotes our best estimate for
the value of $x$ at time $T$.

[133] Let $\hat\sigma_I(y, \varepsilon)$ denote the volatility curve associated with $\ln y - \varepsilon$,
which may be estimated as in the previous section (or computed from
estimates of $\hat\sigma_I(y)$ and the standard deviation of the residual, if we are
willing to view the residual as uncorrelated with $\ln y - \varepsilon$, as is guaranteed
in unconstrained regression). Put
$d_2(y, \varepsilon) = d_2(s_y^*, y e^{-\varepsilon}, \hat\sigma_I(y, \varepsilon))$, so that
$-d_2(y, \varepsilon)$ is a "standardized" measure of $\ln y - \varepsilon$. Then the standard
regression appropriate to our model is

$-d_2(y, \varepsilon) = \rho(\varepsilon)\,(-d_2(x))$.    (40)

[134] There is a residual associated with this regression, which we have 
not written down. It is presumably normal and its variance may be 
computed. For notational reasons we will just imagine it has been 
incorporated into the original $\varepsilon$. As is apparent from the form of the
expressions in the display, an alternative to the above regression is to do it 
with the warped correlation coefficients suggested in the previous section. 
If, in addition, it was appropriate to view the original portfolios as linear 
combinations of warped variables (our standard normal marginals), the 
regression above could be done without any recourse to Monte Carlo 
calculations. Similar remarks would apply if we used constant historical 
volatility functions throughout, though presumably the latter procedure 
would lose accuracy. 

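[135] In any case, we can now answer our "what-if" question as a simple
expectation in the univariate normal distribution of the (adjusted) residual
$\varepsilon$. Abbreviate $d_2(s_x^*, A, \hat\sigma_I(A))$ to $d_2(A)$ and
$d_2(s_y^*, B e^{-\varepsilon}, \hat\sigma_I(B, \varepsilon))$ to $d_2(B, \varepsilon)$.
Assume $\rho(\varepsilon) > 0$ (the natural case of a positive correlation). Then we
have

$\Pr(y > B \mid x > A) = E\big(\Pr(-d_2(y, \varepsilon) > -d_2(B, \varepsilon) \mid -d_2(x) > -d_2(A))\big)$
$\quad = E\big(\Pr(-d_2(x) > \rho(\varepsilon)^{-1}(-d_2(B, \varepsilon)) \mid -d_2(x) > -d_2(A))\big)$    (41)
$\quad = E\big(\min[1,\; N(\rho(\varepsilon)^{-1} d_2(B, \varepsilon)) / N(d_2(A))]\big)$.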

[136] The first equation follows just because $-d_2(y, \varepsilon)$ is monotonically
increasing as a function of $y$; that is, the condition that $y > B$ is completely
equivalent to the condition $-d_2(y, \varepsilon) > -d_2(B, \varepsilon)$. Similar remarks hold for
the condition $x > A$, while the expression $\Pr(y > B \mid x > A)$ just means the
probability that the condition $y > B$ holds when it is known that $x > A$. The
second equation is then derived with the displayed expression above for
$-d_2(y, \varepsilon)$. (If $\rho(\varepsilon)$ is negative, the inequality involving its inverse reverses.)
This inner expectation is then calculated in the normal distribution. For
values of $\varepsilon$ for which $-d_2(A)$ is as large as $\rho(\varepsilon)^{-1}(-d_2(B, \varepsilon))$ the expectation
is a certainty, and yields the value 1. When $-d_2(A)$ is smaller than
$\rho(\varepsilon)^{-1}(-d_2(B, \varepsilon))$, its cumulative normal distribution value $N(-d_2(A))$ is
smaller than $N(\rho(\varepsilon)^{-1}(-d_2(B, \varepsilon)))$, and the probability
$1 - N(-d_2(A)) = N(d_2(A))$ that the standard normal variable $z = -d_2(x)$ is at
least $-d_2(A)$ is larger than the corresponding probability
$1 - N(\rho(\varepsilon)^{-1}(-d_2(B, \varepsilon))) = N(\rho(\varepsilon)^{-1} d_2(B, \varepsilon))$ that $z$ be at
least $\rho(\varepsilon)^{-1}(-d_2(B, \varepsilon))$. The ratio
$N(\rho(\varepsilon)^{-1} d_2(B, \varepsilon)) / N(d_2(A))$, which is the desired inner expectation, is thus
smaller than 1, as is appropriate for a probability, conditional or not. If
$\rho(\varepsilon)$ is negative, similar reasoning leads instead to the expression
$E\big(\max[0,\; (N(d_2(A)) - N(\rho(\varepsilon)^{-1} d_2(B, \varepsilon))) / N(d_2(A))]\big)$ for the desired
conditional probability.
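The expectation over the residual can be approximated by simple quadrature. The sketch below evaluates the min/max expressions given above under simplifying assumptions that are not in the text: constant volatilities in place of the fitted curves, a normal residual with a hypothetical standard deviation `sig_eps`, a trapezoid rule truncated at six standard deviations, and a nonzero correlation. All parameter names are illustrative.

```python
import math

def norm_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def norm_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def d2(s_star, x, sigma, T):
    return (math.log(s_star / x) - 0.5 * sigma ** 2 * T) / (sigma * math.sqrt(T))

def inner_probability(d2_A, d2_B, rho):
    """Inner conditional probability for one residual value (rho must be nonzero)."""
    if rho > 0:
        return min(1.0, norm_cdf(d2_B / rho) / norm_cdf(d2_A))
    return max(0.0, (norm_cdf(d2_A) - norm_cdf(d2_B / rho)) / norm_cdf(d2_A))

def what_if_probability(A, B, sx, sy, sig_x, sig_y, sig_eps, rho, T, steps=2000):
    """Trapezoid-rule expectation over eps ~ N(0, sig_eps^2), truncated at +/- 6 sd."""
    d2_A = d2(sx, A, sig_x, T)
    width = 12.0 / steps
    total = 0.0
    for k in range(steps + 1):
        u = -6.0 + k * width                      # residual in standard deviations
        d2_B = d2(sy, B * math.exp(-sig_eps * u), sig_y, T)
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * inner_probability(d2_A, d2_B, rho) * norm_pdf(u) * width
    return total

# Hypothetical inputs: both portfolios expected at 100, horizon one year.
print(what_if_probability(A=100, B=100, sx=100, sy=100, sig_x=0.2, sig_y=0.2,
                          sig_eps=0.1, rho=0.5, T=1.0))
```

A power-series or closed-form treatment, as the following paragraph of the text suggests, would replace the loop; the quadrature is used here only because it is short and easy to check.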

[137] Although the final answer in either case is an expectation (over $\varepsilon$),
it is essentially an integral that could be computed quickly with power
series. (A very simple and accurate power-series expansion of $N(z)$ is
given on p. 252 of the book by Hull cited above.) Using that, one could
determine by iterative methods what value of $\varepsilon$ makes, say, the ratio
$N(\rho(\varepsilon)^{-1} d_2(B, \varepsilon)) / N(d_2(A))$ equal to 1, and then integrate the ratio against
the standard normal pdf from $-\infty$ to the determined value of $\varepsilon$ in the
$\rho(\varepsilon) > 0$ case. Similar remarks apply if $\rho(\varepsilon) < 0$. (Note that, if $\rho(\varepsilon) = 0$, the
variables $\ln x$ and $\ln y$ are uncorrelated, and the conditional probability
$\Pr(y > B \mid x > A)$ is the same as the unconditional probability $\Pr(y > B)$.)
[138] All of the latter calculations can be done very fast. Of course, we
have already used some Monte Carlo calculations to get this far, unless we 
are in the simplified context of constant volatility functions. 

[139] In some examples, it is easy to say how we would compute an
answer to the same "what-if" question, using our full joint probability
distribution. We simply write

$\Pr(y > B \mid x > A) = E\big(\Pr(-d_2(y, \varepsilon) > -d_2(B, \varepsilon) \mid -d_2(x) > -d_2(A))\big)$    (42)

and interpret $\ln x$ in $-d_2(x)$, and $\ln y - \varepsilon$ in $-d_2(y, \varepsilon)$, in terms of their
expansions in $\ln x_0, \ln x_1, \dots, \ln x_n$. To compute, say, the inner expectation
by a Monte Carlo calculation, we would generate a large number of
random samples of multivariate standard normal vectors $z$ with covariance
matrix $C$. We then take the average, over the samples $z$ which happen to
satisfy the condition $-d_2(x) > -d_2(A)$, of the function which is 1 when
$-d_2(y, \varepsilon) > -d_2(B, \varepsilon)$ and 0 otherwise. We have not experimented to see
whether this method yields better answers than the regression procedure
above. Nevertheless, it illustrates how we could approach more
sophisticated "what-if" questions that could not be easily treated by
regressions. For example, suppose we believe that factor $w$ will remain in
a range $C < w < D$, and ask the same question about $y$, subject to the same
condition on $x$. This is hard to formulate in terms of regression, and is
simply not possible in terms of single-factor regression. However, it is easy
to answer with the full distribution:

$\Pr(y > B \mid x > A,\; C < w < D) =$
$\quad E\big(\Pr(-d_2(y, \varepsilon) > -d_2(B, \varepsilon) \mid -d_2(x) > -d_2(A),\; -d_2(C) < -d_2(w) < -d_2(D))\big)$    (43)
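The full-distribution calculation, including the extra range condition on $w$, can be sketched as follows. The example works directly with standardized variables and a hypothetical 3x3 covariance matrix for (x, y, w). The Cholesky routine and the thresholds are illustrative assumptions, not values from the specification.

```python
import math
import random

def cholesky(c):
    """Lower-triangular L with L L^T = c, for a symmetric positive definite matrix."""
    n = len(c)
    l = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(l[i][k] * l[j][k] for k in range(j))
            if i == j:
                l[i][j] = math.sqrt(c[i][i] - s)
            else:
                l[i][j] = (c[i][j] - s) / l[j][j]
    return l

def sample_normals(c, n_samples, seed=0):
    """Draw normal vectors with mean 0 and covariance matrix c."""
    rng = random.Random(seed)
    l = cholesky(c)
    n = len(c)
    out = []
    for _ in range(n_samples):
        u = [rng.gauss(0.0, 1.0) for _ in range(n)]
        out.append([sum(l[i][k] * u[k] for k in range(i + 1)) for i in range(n)])
    return out

def conditional_indicator_average(zs, event, conditions):
    """Average of the 0/1 event indicator over samples meeting every condition."""
    kept = [z for z in zs if all(cond(z) for cond in conditions)]
    return sum(1 for z in kept if event(z)) / len(kept) if kept else float("nan")

# Hypothetical covariance of the standardized variables for x, y and w.
C = [[1.0, 0.6, 0.3],
     [0.6, 1.0, 0.2],
     [0.3, 0.2, 1.0]]
zs = sample_normals(C, 50000)
p = conditional_indicator_average(
    zs,
    event=lambda z: z[1] > 0.5,                       # standardized y above its threshold
    conditions=[lambda z: z[0] > 0.0,                 # standardized x above its threshold
                lambda z: -1.0 < z[2] < 1.0])         # w confined to a range
print(p)
```

Adding or removing entries in `conditions` changes the question asked without changing the sampling step, which is the flexibility the paragraph contrasts with regression.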

[140] Finally, we may not want to work in the log domain, which, if we
started with a fixed portfolio $x = h_1 x_1 + h_2 x_2 + \dots + h_n x_n$, would force us into an
approximation, as noted. But working with the full distribution, we can
phrase a condition $x > A$ as $h_1 x_1(v_1) + \dots + h_n x_n(v_n) > A$, in the language of
the first section where the vector of $v$'s plays the role of our vector $z$ here.
Monte Carlo calculations can now proceed as before, using log domain
expressions or not for the other conditions.

[141] In the previous examples we were focusing on an investor thinking
about the value of his or her portfolio $y$ in response to the change in a
factor $x$. Conversely, an investor might want to know what the investment
world looks like if a given stock or index $y$ goes to a certain level $B$ at time
$T$. What is the expected value $A$ at time $T$ of another portfolio $x$, or simply
of one of the factors $x_i$? One approach is, upon input by the user that $y$ is
going to level $B$, to list several assets $x_i$ or factors/indices $x$ most highly
correlated with $y$ and their expected values with $y$ at $B$.

[142] It is also possible to display a confidence interval for each selected 
asset or factor, and have other information about its new projected 
probability distribution readily available. We could also offer comparisons
with the old projected probability distribution of $x$, where no assumption
on $y$ is made. Finally, in some cases, where it was possible to explain
much of the variance of $y$ with just a few $x_i$ (appearing in the regression of
$y$), we could list percentage increases/decreases of a portfolio of these $x_i$
required to make $B$ the expected value of $y$, based solely on its dependence
on this portfolio. (For example, the coefficients in the portfolio could
come from the regression of $y$ with respect to all the $x_i$, or some new
regression might be done, perhaps allowing user-defined constraints.) It
should be mentioned that medians or modes are alternatives to expected
values (means) here and above; in any case users will need to be educated
about the fact that the median and mode differ systematically from the
mean in near lognormal distributions.

[143] The main problem might be viewed as understanding the
probability distribution of $x$, given that $y > B$ at a given time $T$, with $x$ and
$y$ as in the previous section. This can be approached by the methods of the
previous sections, by reversing the roles of the variables.

[144] There is, however, a simpler question that can be treated in an
especially quick way. Consider the problem of determining the mean of $x$
conditioned on the equality $y = B$ at time $T$. The idea is to use simple
regression methods, but interpret answers as measured in terms of our
variable volatilities. In our previous notation, we have a regression

$-d_2(x) = \rho \cdot (-d_2(y, \varepsilon)) + \nu$,    (44)

where $\rho$ (which we called $\rho(\varepsilon)$ in the previous section) is the historically
determined correlation between $\ln x$ and the random variable $\ln y - \varepsilon$. Note
that the roles of dependent and independent variable are reversed. There is
also a residual $\nu$, which has mean 0 here, and plays no role (gets averaged
away). Thus, the desired conditional expected value $A$ of $x$ is obtained
from

$-d_2(A) = E(-d_2(x) \mid y = B) = \rho \cdot E\big(-d_2(s_y^*, B e^{-\varepsilon}, \hat\sigma_I(y, \varepsilon))\big) = \rho \cdot \big(-d_2(s_y^*, B, \hat\sigma_I(y, \varepsilon))\big)$.
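Solving the last relation for $A$ is a one-line inversion when the volatilities are treated as constants. The sketch below makes exactly that simplifying assumption (with fitted curves, $A$ would instead be found iteratively, since the volatility then depends on $A$). The parameter names and values are hypothetical.

```python
import math

def d2(s_star, x, sigma, T):
    return (math.log(s_star / x) - 0.5 * sigma ** 2 * T) / (sigma * math.sqrt(T))

def conditional_level(B, sx, sy, sig_x, sig_y, rho, T):
    """Solve d2(sx, A, sig_x) = rho * d2(sy, B, sig_y) for A (constant volatilities)."""
    target = rho * d2(sy, B, sig_y, T)   # required value of d2(sx, A, sig_x)
    return sx * math.exp(-target * sig_x * math.sqrt(T) - 0.5 * sig_x ** 2 * T)

# Hypothetical inputs: y is assumed to reach 110 in one year.
A = conditional_level(B=110.0, sx=50.0, sy=100.0, sig_x=0.3, sig_y=0.2, rho=0.6, T=1.0)
print(A)
```

With $\rho = 0$ the answer does not depend on $B$ at all, and with $\rho = 1$ and identical parameters for both variables it returns $A = B$, which are two quick sanity checks on the relation.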

[145] Recall that $\hat\sigma_I(y, \varepsilon)$ is an estimate, obtained by Monte Carlo
methods, of the implied volatility $\sigma_I$ associated to the random variable
$\ln y - \varepsilon$. For faster but less accurate calculations it can be estimated historically
as $\sqrt{\sum_{i,j} \gamma_i \sigma_i \rho_{ij} \gamma_j \sigma_j}$, with each of the $\sigma$s, $\gamma$s, and $\rho$s here given historically. (See
the previous section for notation.) Similarly, for fast calculations, $-d_2(x)$
could use historical volatility, though we expect it to be given more
accurately, or rather, more accurately according to the market view, as
$-d_2(x) = -d_2(s_x^*, x, \hat\sigma_I(x))$, using the implied volatility function estimate
$\hat\sigma_I(x)$. If $x = x_i$ is a single asset or index in our model, then
$\hat\sigma_I(x) = \sigma_I(x_i)$ does not require a Monte Carlo estimate, but is
presumably already available.

[146] To summarize, the conditional expected values required to answer 
such questions are easily obtained by regression methods. The accuracy of 
such answers is enhanced, or at least shaped more to reflect market input, 
when all logvariables are measured in "standard deviations," interpreted as 
our variable volatilities. 

[147] In some examples, these methods, when using full Monte Carlo
calculations, easily apply to portfolios containing option securities. The
well-known idea is to think of an option as a kind of nonlinear portfolio -
a quadratic one, to be more precise. Thus, an option on a single underlying
security with underlying price $x_1$ has a price approximately
$x = c + \Delta (x_1 - s_1) + \tfrac{1}{2}\Gamma (x_1 - s_1)^2$ near $s_1$, where the option was
evaluated to a known value $c$. Here $\Delta$ and $\Gamma$ are well-known parameters in
the options markets, giving the first and second derivatives of the option
price at $s_1$ with respect to the underlying security price $x_1$. Perhaps the most
characteristic feature of options is that they have nonzero $\Gamma$ - their
proportion of increase or decrease with respect to the underlying security
price changes as the security price changes. Explicit formulas in terms of
other standard parameters are available, say, in the Black-Scholes theory
for both $\Delta$ and $\Gamma$ (see the Hull book cited above). Such formulas could be
obtained by differentiation directly in other theories or when using
empirically-fitted curves. In any case, once we have such an explicit
approximation to $x$, its probability distribution is easily given by the
Monte Carlo process described above. The same method applies as well to
portfolios containing several options and other securities.
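The quadratic (delta-gamma) treatment of an option inside a Monte Carlo run can be sketched as below. The option parameters, holdings, and the lognormal model for the underlying price are hypothetical. In practice $\Delta$ and $\Gamma$ would come from a pricing theory or a fitted curve, as the paragraph explains.

```python
import math
import random

def delta_gamma_value(c, delta, gamma, s1, x1):
    """Quadratic approximation c + delta (x1 - s1) + 0.5 gamma (x1 - s1)^2."""
    return c + delta * (x1 - s1) + 0.5 * gamma * (x1 - s1) ** 2

def empirical_cdf(values, level):
    """Fraction of simulated values at or below the given level."""
    return sum(1 for v in values if v <= level) / len(values)

# Hypothetical option: valued at c when the underlying is at s1.
s1, c, delta, gamma = 100.0, 5.0, 0.55, 0.03
rng = random.Random(7)
underlying = [s1 * math.exp(rng.gauss(0.0, 0.05)) for _ in range(20000)]

# Portfolio of 10 options plus 2 units of the underlying security.
portfolio = [10 * delta_gamma_value(c, delta, gamma, s1, x1) + 2 * x1 for x1 in underlying]
print(empirical_cdf(portfolio, 10 * c + 2 * s1))
```

Since the approximation is only trusted near $s_1$, samples far from $s_1$ would call for revaluing the option rather than extending the quadratic.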

Applications that use the probability distribution information 

[148] A wide variety of techniques may be used to accumulate and 
process the information needed for the calculations described above and to 
provide the information to users directly or indirectly through third parties. 
Some of these techniques are described below. 

[149] As shown in figure 4, the probability distribution information can 
be provided to users from a host server 102 connected to a communication 
network 104, for example, a public network such as the Internet or a 
private network such as a corporate intranet or local area network (LAN). 
For purpose of illustration, the following discussion assumes that network 
104 is the Internet. 

[150] The host server 102 includes a software suite 116, a financial
database 120, and a communications module 122. The communications
module 122 transmits and receives data generated by the host server 102
according to the communication protocols of the network 104. 

[151] Also connected to the network are one or more of each of the 
following (only one is shown in each case): an individual or institutional 
user 108, an advertisement provider 110, a financial institution 112, a third
party web server 114, a media operator 124, and a financial information
provider 106. 
[152] The operator of the host server could be, for example, a financial
information source, a private company, a vendor of investment services, or 
a consortium of companies that provides a centralized database of 
information. 

[153] The host server 102 runs typical operating system and web server 
programs that are part of the software suite 116. The web server programs
allow the host server 102 to operate as a web server and generate web 
pages or elements of web pages, e.g., in HTML or XML code, that allow 
each user 108 to receive and interact with probability distribution 
information generated by the host server. 

[154] Software suite 116 also includes analytical software 118 that is
configured to analyze data stored in the financial database 120 to generate, 
for example, the implied probability distribution of future prices of assets 
and portfolios. 

[155] The financial database 120 stores financial information collected 
from the financial information providers 106 and computation results 
generated by the analytical software 118. The financial information
providers 106 are connected to the network 104 via a communication link
126, or the financial information providers may feed the information
directly to the host server through a dialup or dedicated line (not shown). 

[156] Figure 4 gives a functional view of an implementation of the
invention. Structurally, the host server could be implemented as one or 
more web servers coupled to the network, one or more applications servers 
running the analytical software and other applications required for the 
system and one or more database servers that would store the financial 
database and other information required for the system. 
[157] Figure 10 shows an example of a data feed 150 sent from the
financial information provider 106 to the host server 102 through the
communication link 126. Information is communicated to the host server
in the form of messages 151, 152. Each message contains a stream of one
or more records 153, each of which carries information about option prices
for an underlying asset. Each message includes header information 154
that identifies the sender and receiver, the current date 155, and an end of 
message indicator 158, which follows the records contained in the 
message. 

[158] Each record 153 in the stream includes an identifier 156 (e.g., the 
trading symbol) of an underlying asset, an indication 158 of whether the 
record pertains to a put or call, the strike date 1 60 of the put or call, the 
strike price 162 of the put or call, current bid-ask prices 164 of the 
underlying asset, bid-ask prices 1 66 for the option, and transaction 
volumes 168 associated with the option. The financial information 
provider 106 may be an information broker, such as Reuters, Bridge, or 
Bloomberg, or any other party that has access to or can generate the 
information carried in the messages. The broker may provide information 
from sources that include, for example, the New York Stock Exchange 
and the Chicago Board of Options Exchange. 
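The message and record layout described in paragraphs [157] and [158] maps naturally onto simple data structures. The sketch below is one possible in-memory representation. The class names, field names, and the comma-separated wire format handled by `parse_record` are hypothetical, since the specification does not define an encoding.

```python
from dataclasses import dataclass
from datetime import date
from typing import List

@dataclass
class OptionRecord:
    symbol: str            # identifier of the underlying asset (e.g., trading symbol)
    is_call: bool          # True for a call, False for a put
    strike_date: date      # strike date of the put or call
    strike_price: float    # strike price of the put or call
    underlying_bid: float  # current bid price of the underlying asset
    underlying_ask: float  # current ask price of the underlying asset
    option_bid: float      # bid price for the option
    option_ask: float      # ask price for the option
    volume: int            # transaction volume associated with the option

@dataclass
class FeedMessage:
    sender: str                  # header: who sent the message
    receiver: str                # header: intended recipient
    current_date: date           # date carried in the message
    records: List[OptionRecord]  # stream of records, followed on the wire by an end marker

def parse_record(line: str) -> OptionRecord:
    """Parse one comma-separated record; the field order is an assumed wire format."""
    sym, kind, exp, strike, ubid, uask, obid, oask, vol = line.strip().split(",")
    return OptionRecord(sym, kind.upper() == "C", date.fromisoformat(exp), float(strike),
                        float(ubid), float(uask), float(obid), float(oask), int(vol))

record = parse_record("XYZ,C,2001-06-15,50.0,48.25,48.50,2.10,2.30,1200")
message = FeedMessage("provider", "host", date(2001, 3, 1), [record])
print(message.records[0].symbol, message.records[0].strike_price)
```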

[159] The financial database 120 stores the information received in the 
information feed from the financial information providers and other 
information, including, for example, interest rates and volatilities. The 
financial database also stores the results generated by the analytical 
software, including probability distribution functions with respect to the 
underlying assets and assets that are not the subject of options. 
[160] The probability distribution information is generated continually
(and essentially in real time) from the incoming options data so that the 
information provided and displayed to users is current. That is, the 
information is not based on old historical data but rather on current 
information about option prices. 

[161] In addition, other soft information can be accumulated, stored, and 
provided to users, including fundamental characteristics of the underlying 
assets, including prices, volatility values, beta, the identification of the 
industry to which the asset belongs, the yield, the price to book ratio, and 
the leverage. Other information could include calendars of earnings 
forecast dates, earnings forecasts, corporate action items, news items that 
relate to an industry, and the volume of institutional holdings. 

[162] The messages from the information provider 106 may be sent in 
response to requests by the host server 102, the information may be sent to 
the host server 102 automatically at a specified time interval, or the 
information may be sent as received by the information provider from its
sources. The financial database 120 may be maintained on a separate 
server computer (not shown) that is dedicated to the collection and 
organization of financial data. The financial database is organized to 
provide logical relationships among the stored data and to make retrieval 
of needed information rapid and effective. 

[163] The user 108 may use, for example, a personal computer, a TV set
top box, a personal digital assistant (PDA), or a portable phone to 
communicate with the network 104. Any of these devices may be running 
an Internet browser to display the graphical user interface (GUI) generated 
by the host server 102. 
[164] The host server 102 may provide probability distribution
information on the network 104 in the form of web pages and allow the
individual user 108, the financial institution 112, the third party web
server 114, and the media operator 124 to view the information freely. The
host company that runs the host server 102 may generate revenue by, for 
example, selling advertisement space on its web pages to an advertisement 
provider 110. 

[165] The host server 102 may also provide proprietary information and 
enhanced services to individual users 108, financial institutions 112, third
party web servers 114, and media operators 124 for a subscription fee.

[166] The host server 102 may have a direct link to the financial 
institutions 112 to provide tailored information in a format that can be
readily incorporated into the databases of the financial institutions 112.
Financial institutions 112 may include, for example, investment banks,
stock brokerage firms, mutual fund providers, bank trust departments,
investment advisers, and venture capital investment firms. These
institutions may incorporate the probability distribution information
generated by the analytical software 118 into the financial services that
they provide to their own subscribers. The probability distribution
information provided by the host server 102 enables the stock brokerage
firms to provide better advice to their customers. 

[167] A third party web server 114 may incorporate probability
distribution information into its web site. The information may be
delivered in the form of an information feed to the third party host of web
server 114 either through the Internet or through a dedicated or dial-up
connection. 
[168] Figure 10 shows an example of a data feed 182 sent from the host
server 102 to the third party web server 114 through communication link
128. Data feed 182 carries messages 184 that include header information 
186, identifying the sender and receiver, and records 188 that relate to 
specific underlying assets. 

[169] Each record 188 includes an item 190 that identifies a future date, 
a symbol 192 identifying the asset, risk-neutral probability density 
information 193 and cumulative distribution information 194. The record 
could also include a symbol identifying a second asset 195 with respect to 
the identified future date, and so on. Other information could be provided 
such as a risk premium value with respect to the risk-neutral values. 

[170] Examples of third party web servers 114 are the web servers of
E*TRADE, CBS MarketWatch, Fidelity Investments, and The Wall Street
Journal. The third party web server 114 specifies a list of assets for which
it needs probability distribution information. Host server 102 periodically
gathers information from financial information provider 106 and its own
financial database 120, generates the probability distribution information
for the specified list of assets, and transmits the information to the third
party web server 114 for incorporation into its web pages.

[171] Examples of the media operator 124 are cable TV operators and 
newspaper agencies that provide financial information. For example, a 
cable TV channel that provides stock price quotes may also provide 
probability distribution information generated by the host server 102. A 
cable TV operator may have a database that stores the probability 
distributions of all the stocks that are listed on the NYSE for a number of 
months into the future. The host server 102 may periodically send updated 
information to the database of the cable TV operator. When a subscriber 
of the cable TV channel views the stock price quotes on a TV, the 
subscriber may send commands to a server computer of the cable TV 
operator via modem to specify a particular stock and a particular future 
date. In response, the server computer of the cable TV operator retrieves 
the probability distribution information from its database and sends the 
information to the subscriber via the cable network, e.g., by encoding the 
probability distribution information in the vertical blanking interval of the 
TV signal. 

[172] Likewise, a newspaper agency that provides daily transaction price 
quotes may also provide the probabilities of stock prices rising above 
certain percentages of the current asset prices at a predetermined future 
date, e.g., 6 months. A sample listing in a newspaper may be "AMD 83 
88 85 A40%", meaning that the AMD stock has a lowest price of $83, 
highest price of $88, a closing price of $85 that is higher than the previous 
closing price, and a 40% probability of rising 10% in 6 months. 

[173] The analytical software 118 may be written in any computer 
language such as Java, C, C++, or FORTRAN. The software may include 
the following modules: (1) input module for preprocessing data received 
from the financial data sources; (2) computation module for performing 
the mathematical analyses; (3) user interface module for generating a 
graphical interface to receive inputs from the user and to display charts 
and graphs of the computation results; and (4) communications interface 
module for handling the communications protocols required for accessing 
the networks. 

Web pages and user interfaces 

[174] A variety of web pages and user interfaces can be used to convey 
the information generated by the techniques described above. 

[175] For example, referring to figure 5, a GUI 700 enables a user 108 to 
obtain a range of financial services provided by the host server 102. The 
user 108 may see the implied probabilities of future prices of marketable 
assets 706 having symbols 704 and current prices 708. The information 
displayed could include the probabilities 714 (or 718) of the asset prices 
rising above a certain specified percentage 712 (or falling below a certain 
specified percentage 716) of the reference price 710 within a specified 
period of time 720. 
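
As a rough sketch of how probabilities such as 714 and 718 could be read off a cumulative distribution, the following assumes the distribution is available as a table of (price, F) points, that values between points are linearly interpolated, and that the thresholds are stated as percentages of the reference price evaluated at the end of the specified period. The function names and the sample table are hypothetical.

```python
from bisect import bisect_right


def cdf_at(points, x):
    """Linearly interpolate a tabulated CDF given as sorted (price, F) pairs."""
    prices = [p for p, _ in points]
    if x <= prices[0]:
        return points[0][1]
    if x >= prices[-1]:
        return points[-1][1]
    i = bisect_right(prices, x)  # prices[i - 1] <= x < prices[i]
    (x0, f0), (x1, f1) = points[i - 1], points[i]
    return f0 + (f1 - f0) * (x - x0) / (x1 - x0)


def tail_probabilities(points, reference_price, up_pct, down_pct):
    """Return (P[price above up_pct% of reference], P[price below down_pct%])."""
    upper = reference_price * up_pct / 100.0
    lower = reference_price * down_pct / 100.0
    return 1.0 - cdf_at(points, upper), cdf_at(points, lower)


# Hypothetical cumulative distribution for one future date.
table = [(60.0, 0.0), (80.0, 0.25), (100.0, 0.50), (120.0, 0.75), (140.0, 1.0)]
p_above, p_below = tail_probabilities(table, reference_price=100.0,
                                      up_pct=110.0, down_pct=90.0)
print(f"above 110%: {p_above:.1%}, below 90%: {p_below:.1%}")
```

The probability of rising above a threshold is the complement of the cumulative value at that threshold, while the probability of falling below it is the cumulative value itself.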

[176] For the convenience of the user 108, GUI 700 includes links 730 to 
institutions that facilitate trading of the assets. The host company that 
runs the host server 102 sells advertising space 728 on the GUI 700 to 
obtain revenue. The GUI 700 also has links 726 to other services provided 
by the host server 102, including providing advice on lifetime financial 
management, on-line courses on topics related to trading of marketable 
assets, research on market conditions related to marketable assets, and 
management of portfolios of assets. 

[177] Referring to figure 6, the GUI 700 also may display an interactive 
web page to allow the user 108 to view the market's current prediction of 
future values of portfolios of assets. The past market price 734 and current 
market price 736 of the asset portfolios 732 are displayed. Also displayed 
is the price difference 738. The GUI 700 displays the probability 744 (or 
746) that the portfolio 732 will gain (or lose) a certain percentage 740 
within a specified time period 742. Examples of portfolios include stock 
portfolios, retirement 401(k) plans, and individual retirement accounts. 
Links 748 are provided to allow the user 108 to view the market's current 
forecast of future price trends of the individual assets within each 
portfolio. 

[178] Referring to figure 7, in another user interface, the GUI 700 
displays an interactive web page that includes detailed analyses of past 
price history and the market's current forecast of the probability 
distribution of the future values of a marketable asset over a specified 
period of time. The GUI 700 includes price-spread displays 750 
representing the cumulative distribution values of the predicted future 
prices of an asset over periods of time. The price-spread display 750a 
shows the price distribution data that was generated at a time three months 
earlier. A three-month history of the actual asset prices is shown as a line 
graph for comparison to give the user 108 a measure of the merit of the 
price distribution information. The price-spread display 750b represents 
the predicted cumulative distribution values of the asset prices over a 
period of one month into the future. The left edge of display 750b, of 
course, begins at the actual price of the asset as of the end of the prior 
three-month period, e.g., the current DELL stock price of $50. The 
probability distribution information implies, for example, a 1% probability 
that the stock price will fall below $35, and a 99% probability that the stock 
price will fall below $80 in one month. GUI 700 includes table 752 that 
shows highlights of asset information and graph 754 that shows sector 
risks of the asset. A box 755 permits a user to enter a target price and table 
757 presents the probability of that price at four different future times, 
based on the calculated implied probability distributions. 

[179] Referring to figure 8, in another approach, a window 402 is 
displayed on a user's screen showing financial information along with two 
other windows 408 and 410 showing probability distribution information. 
The individual user 108 could have previously downloaded a client 
program from the host server 102. When the user is viewing any 
document, e.g., any web page (whether of the host server 102 or of 
another host's server), the user may highlight a stock symbol 404 using a 
pointer 406 and type a predetermined keystroke (e.g., "ALT-SHIFT-Q") to 
invoke the client program. The client program then sends the stock symbol 
as highlighted by the user to the host server 102. The host server 102 
sends probability distribution information back to the client program, 
which in turn displays the information in separate windows 408 and 410. 

[180] When the client program is invoked, a window 422 may be 
displayed showing the different types of price information that can be 
displayed. In the example shown, the "Probability distribution curve" and 
"Upper/lower estimate curves" are selected. Window 408 shows the price 
range of AMD stock above and below a strike price of $140 from July to 
December, with 90% probability that the stock price will fall between the 
upper and lower estimate curves. Window 410 shows the probability 
density curve f(x) for AMD stock for a future date of 8/15/2000. The user 
may also specify a default function curve, such that whenever an asset 
name is highlighted, the default function curve is displayed without any 
further instruction from the user. 
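
One way upper and lower estimate values enclosing 90% probability could be obtained for a single future date is to invert the cumulative distribution at two probability levels. The sketch below assumes a band with equal tails (the 5% and 95% levels), a tabulated and strictly increasing cumulative distribution, and linear interpolation between points. None of these choices is stated in the specification, and the function name and sample values are hypothetical.

```python
def quantile(points, q):
    """Invert a tabulated CDF (sorted (price, F) pairs, F strictly increasing)."""
    for (x0, f0), (x1, f1) in zip(points, points[1:]):
        if f0 <= q <= f1:
            return x0 + (x1 - x0) * (q - f0) / (f1 - f0)
    raise ValueError("q lies outside the tabulated range")


# Hypothetical cumulative distribution for one future date.
table = [(100.0, 0.0), (120.0, 0.05), (140.0, 0.50), (160.0, 0.95), (180.0, 1.0)]

lower = quantile(table, 0.05)  # 5% chance of the price ending below this value
upper = quantile(table, 0.95)  # 5% chance of the price ending above this value
print(f"90% band: {lower:.2f} to {upper:.2f}")
```

Repeating the calculation for each future date would give one point on each of the two curves per date.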

[181] Tabular data such as those shown in TABLE 1 may be generated 
by the host server 102 and transmitted over the network 104 to devices 
that have limited capability for displaying graphical data. As an example, 
the individual user 108 may wish to access asset probability distribution 
information using a portable phone. The user enters commands using the 
phone keypad to specify a stock, a price, and a future date. In response, the 
host server 102 returns the probability of the stock reaching the specified 
price at the specified future date in tabular format suitable for display on 
the portable phone screen. 

[182] Referring to figure 9, a portable phone 500 includes a display 
screen 502, numeric keys 506, and scrolling keys 504. A user may enter 
commands using the numeric keys 506. Price information received from 
the host server 102 is displayed on the display screen 502. Tabular data 
typically includes a long list of numbers, and the user may use the scroll 
keys 504 to view different portions of the tabular data. 

[183] In the example shown in display screen 502, the AMD stock has a 
current price of $82. The cumulative distribution values F(x) for various 
future prices on 8/15/2000 are listed. The distribution indicates a 40% 
probability that the stock price will be below $80, implying a 60% 
probability of the stock price being above $80. Likewise, the distribution 
indicates an 80% probability that the stock price will be at most $90, 
implying a probability of 20% of the stock price being above $90. 
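
The arithmetic behind such a table can be sketched as follows, reading each cumulative value F(x) as the probability that the price is at or below x, so that the probability of being above x is 1 - F(x). Only the two cumulative values mentioned for the AMD example are used. The row layout and the function name are assumptions, not the format shown in figure 9.

```python
# Cumulative distribution values F(x) for 8/15/2000 taken from the example.
cdf_table = {80: 0.40, 90: 0.80}


def format_rows(table):
    """Build one text row per price: price, P(at or below), P(above)."""
    rows = []
    for price in sorted(table):
        below = table[price]
        above = 1.0 - below  # complement of the cumulative value
        rows.append(f"${price:>3}  {below:.0%}  {above:.0%}")
    return rows


for row in format_rows(cdf_table):
    print(row)

# Probability of the price ending between the two tabulated prices.
between = cdf_table[90] - cdf_table[80]
print(f"between $80 and $90: {between:.0%}")
```

Rows of this kind are short enough to fit on a small screen, with the scrolling keys used to reach the remaining prices.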

[184] Other embodiments are within the scope of the following claims. 